HDFS working directory usage

2019-01-31 Thread ketan dikshit
Hi Team,

A lot of data has accumulated in our hdfs-working-directory, so we want to
understand what the following job data is used for once the job has completed
and the segment has been created successfully.

cuboid
fact_distinct_columns
hfile
rowkey_stats

Basically, I need to understand the purpose of cuboid, fact_distinct_columns,
hfile and rowkey_stats after the job has built the cube segment (assuming we
don't use any merging or auto-merging of segments on the cube later).

The space taken up by this data in hdfs-working-dir is quite large (affecting
our costs), and it is not being cleaned up by the cleanup job
(org.apache.kylin.tool.StorageCleanupJob). So we need to understand whether
cleaning it up manually would cause any issues later.

Thanks,
Ketan@Exponential

Re: Fwd: Unsubscribe

2019-01-31 Thread 天涯
Unsubscribe

-- Original message --
From: "徐丹"

Re: [New Document] Kylin SQL reference

2019-01-31 Thread Iñigo Martínez
Thank you, Shaofeng.

Our BI team is really thankful for this documentation.

On Wed, 30 Jan 2019 at 16:13, ShaoFeng Shi wrote:

> Hello Kylin users,
>
> A new document has been added to the Apache Kylin website introducing the
> SQL grammar, functions and data types that Kylin supports. We believe it
> will help new users. Many thanks to Na Zhai, who drafted this doc and
> verified the sample queries.
>
> Here is the link:
>
> English:
> https://kylin.apache.org/docs/tutorial/sql_reference.html
>
> Chinese:
> https://kylin.apache.org/cn/docs/tutorial/sql_reference.html
>
> Best regards,
>
> Shaofeng Shi 史少锋
> Apache Kylin PMC
> Work email: shaofeng@kyligence.io
> Kyligence Inc: https://kyligence.io/
>
> Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
> Join Kylin user mail group: user-subscr...@kylin.apache.org
> Join Kylin dev mail group: dev-subscr...@kylin.apache.org
>
>
>

-- 




Iñigo Martínez
SYSTEMS MANAGER
imarti...@telecoming.com 








Paseo de la Castellana, 95. Torre Europa, pl 16. 28046 Madrid, Spain |
telecoming.com 





Fwd: Unsubscribe

2019-01-31 Thread 徐丹
- Forwarded message -

From: 徐丹
Sent: 24 December 2018 11:46
To: dev@kylin.apache.org
Cc:
Subject: Unsubscribe
Please help me unsubscribe, thank you!
x_u...@163.com




On 24 December 2018 at 08:08, 许帅 wrote:

Please help me unsubscribe, thanks.

许帅
Email: xushuai9...@163.com

[jira] [Created] (KYLIN-3802) Kylin build process fails at step "Materialize Hive View in Lookup Tables"

2019-01-31 Thread JIRA
Iñigo Martinez created KYLIN-3802:
-

         Summary: Kylin build process fails at step "Materialize Hive View in Lookup Tables"
             Key: KYLIN-3802
             URL: https://issues.apache.org/jira/browse/KYLIN-3802
         Project: Kylin
      Issue Type: Bug
      Components: Job Engine
Affects Versions: v2.4.1
        Reporter: Iñigo Martinez


When several cubes that use a Hive view as their source are built in parallel,
the builds sometimes fail at step two (Materialize Hive View in Lookup Tables).

This happens because, when the source is a view rather than a table, Kylin
creates the intermediate table without a UUID suffix, so a second build drops
the intermediate table created by the previous build.

For example:
{code:java}
0: jdbc:hive2://bi-horton-hive.internalserver> DROP TABLE IF EXISTS kylin_intermediate_DW_DI_OPERADORES_SMS_VIEW;
No rows affected (0.586 seconds)

0: jdbc:hive2://bi-horton-hive.internalserver> CREATE EXTERNAL TABLE IF NOT EXISTS kylin_intermediate_DW_DI_OPERADORES_SMS_VIEW LIKE DW.DI_OPERADORES_SMS_VIEW LOCATION 'hdfs://XX/kylin/kylin_metadata/kylin-5df95c88-a123-44e0-9b1a-c35ecf1599fb/kylin_intermediate_DW_DI_OPERADORES_SMS_VIEW';
No rows affected (0.308 seconds)

0: jdbc:hive2://bi-horton-hive.internalserver> ALTER TABLE kylin_intermediate_DW_DI_OPERADORES_SMS_VIEW SET TBLPROPERTIES('auto.purge'='true');
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Table not found kylin_intermediate_DW_DI_OPERADORES_SMS_VIEW (state=08S01,code=1)
{code}
 

If no view is used, Kylin appends a UUID to the intermediate table name to
avoid conflicts.
{code:java}
DROP TABLE IF EXISTS kylin_intermediate_agencia_cubo_v4_4b6d70dd_e0e4_4247_8949_0adef5c0d6c4;
CREATE EXTERNAL TABLE IF NOT EXISTS kylin_intermediate_agencia_cubo_v4_4b6d70dd_e0e4_4247_8949_0adef5c0d6c4
{code}
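
The naming difference between the two cases can be sketched in shell. This is only an illustration of the pattern visible in the two examples, not Kylin's actual implementation: the function name intermediate_table_name is hypothetical, and the sketch assumes the suffix is simply a UUID with its hyphens replaced by underscores.

```shell
#!/bin/sh
# Hypothetical helper, not part of Kylin: builds an intermediate table name
# from a base name and an optional UUID.
intermediate_table_name() {
    base="$1"
    id="$2"
    if [ -z "$id" ]; then
        # No suffix: every build of the same source maps to the same table,
        # so parallel builds can drop each other's table.
        printf 'kylin_intermediate_%s\n' "$base"
    else
        # Replace hyphens with underscores, matching the name in the example.
        suffix=$(printf '%s' "$id" | sed 's/-/_/g')
        printf 'kylin_intermediate_%s_%s\n' "$base" "$suffix"
    fi
}

view_name=$(intermediate_table_name "DW_DI_OPERADORES_SMS_VIEW")
cube_name=$(intermediate_table_name "agencia_cubo_v4" "4b6d70dd-e0e4-4247-8949-0adef5c0d6c4")
echo "$view_name"
echo "$cube_name"
```

With a distinct UUID per build, two parallel builds get distinct table names, so one build's DROP TABLE cannot remove the table another build has just created.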



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KYLIN-3801) find-hive-dependency.sh fail to grep env:CLASSPATH from beeline output

2019-01-31 Thread Nikodimos Nikolaidis (JIRA)
Nikodimos Nikolaidis created KYLIN-3801:
---

         Summary: find-hive-dependency.sh fail to grep env:CLASSPATH from beeline output
             Key: KYLIN-3801
             URL: https://issues.apache.org/jira/browse/KYLIN-3801
         Project: Kylin
      Issue Type: Bug
      Components: Tools, Build and Test
        Reporter: Nikodimos Nikolaidis


On a Debian stretch system with GNU grep version 2.27, whenever
bin/find-hive-dependency.sh is executed with beeline enabled, the following
error message is produced:
{noformat}
Retrieving hive dependency...
./find-hive-dependency.sh: line 40: [: too many arguments
Couldn't find hive configuration directory. Please set HIVE_CONF to the path which contains hive-site.xml.
{noformat}
In line 34, the beeline output format is set to dsv. grep treats this output
as binary data, although it is text, which leads to
{code:java}
hive_env='Binary file (standard input) matches'
{code}
instead of the expected env:CLASSPATH line, and this causes the error above.
One solution would be to pass grep's '--text' flag ('-a') to force it to
process the beeline output as text.
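
A minimal sketch of the effect and of the proposed fix, assuming GNU grep. The sample classpath value is made up, and the trailing NUL byte is only an assumption used to make grep classify the input as binary; the real beeline output may trigger the detection for a different reason.

```shell
#!/bin/sh
# Simulated beeline output line. The NUL byte is an assumption, used only to
# make grep classify the input as binary.
sample() {
    printf 'env:CLASSPATH=/opt/hive/conf\000\n'
}

# Without --text, GNU grep reports a binary match instead of printing the line.
plain=$(sample | grep 'env:CLASSPATH' 2>&1 | tr -d '\000')

# With -a (--text), grep prints the matching line itself.
forced=$(sample | grep -a 'env:CLASSPATH' | tr -d '\000')

echo "plain:  $plain"
echo "forced: $forced"
```

The exact wording of the binary-match message differs between grep versions, so only the second result is stable enough to rely on in a script.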





[jira] [Created] (KYLIN-3800) Count distinct result is incorrect

2019-01-31 Thread Chao Long (JIRA)
Chao Long created KYLIN-3800:


         Summary: Count distinct result is incorrect
             Key: KYLIN-3800
             URL: https://issues.apache.org/jira/browse/KYLIN-3800
         Project: Kylin
      Issue Type: Bug
      Components: Real-time Streaming
        Reporter: Chao Long
        Assignee: Chao Long








Re: kylin merge

2019-01-31 Thread ????
2.2.0




-- Original message --
From: "Na Zhai";
Sent: 31 January 2019 2:54
To: "dev@kylin.apache.org";

Subject: Re: kylin merge



Hi, wanglin.

   What's your Kylin version? There is an issue about auto merge:
https://issues.apache.org/jira/browse/KYLIN-3718.

But I think your error is not related to that issue. It may be caused by the
deletion of the cube_statistics directory.







From: <1059790...@qq.com>
Sent: Monday, January 28, 2019 10:23:37 AM
To: dev
Subject: kylin merge

??
  ??kylin ??cube728
kylin 
??cube


#2 Step Name: Merge Cuboid Statistics
Duration: 0.01 mins Waiting: 0 seconds
java.lang.NullPointerException
    at org.apache.kylin.engine.mr.steps.MergeStatisticsStep.doWork(MergeStatisticsStep.java:80)
    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:125)
    at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:64)
    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:125)
    at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:144)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)