Hi,

> Suppose IoTDB writes data into hdfs:///usr/iotdb/data/root.sg1
> Now there are two complete TsFiles and an incomplete TsFile.
> If a user creates a Hive table referring to this folder, how many
> TsFiles will Hive observe?

It depends on the implementation. When we create a Hive table over a folder, we
could get all files in that folder, including both complete and incomplete
files.
How to construct the table schema and how many data splits to generate is up to
us. (In my opinion, it is better to ignore the incomplete files.)
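
For example, a rough Java sketch of how the split-generation side could skip
incomplete files when scanning the folder (the class name, the ".tsfile" suffix
check, and the completeness test below are only my illustration, not the actual
hive-connector code):

    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class TsFileListing {

      // List only the TsFiles in a folder that look complete, so that split
      // generation never touches a file IoTDB is still writing.
      public static List<FileStatus> listCompleteTsFiles(Configuration conf, Path dir)
          throws IOException {
        FileSystem fs = dir.getFileSystem(conf);
        List<FileStatus> completeFiles = new ArrayList<>();
        for (FileStatus status : fs.listStatus(dir)) {
          if (!status.isFile() || !status.getPath().getName().endsWith(".tsfile")) {
            continue; // skip sub-directories and non-TsFile entries
          }
          if (looksComplete(status)) {
            completeFiles.add(status);
          }
        }
        return completeFiles;
      }

      // Placeholder check: a sealed TsFile ends with a magic tail, so a real
      // implementation would open the file and verify its last bytes here.
      private static boolean looksComplete(FileStatus status) {
        return status.getLen() > 0;
      }
    }

If splits are generated by listing the folder like this at query time, newly
sealed TsFiles would show up on the next query, while half-written files are
simply never turned into splits.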

> Then someone writes data into IoTDB, and finally there are 5 TsFiles and an
> incomplete TsFile; how many TsFiles will Hive observe?

Once a table is created, whether its schema can be updated is beyond my
knowledge (most probably it cannot); maybe @Yuan Tian could give us an answer.

Best,
--
Jialin Qiao
School of Software, Tsinghua University

乔嘉林
清华大学 软件学院

> -----Original Message-----
> From: "Xiangdong Huang" <[email protected]>
> Sent: 2019-10-14 08:22:11 (Monday)
> To: [email protected]
> Cc: 
> Subject: Re: Questions about Hive Integration
> 
> Hi,
> 
> What is more important is the 3rd question...
> 
> Suppose IoTDB writes data into hdfs:///usr/iotdb/data/root.sg1
> Now there are two complete TsFiles and an incomplete TsFile.
> 
> If a user creates a Hive table referring to this folder, how many
> TsFiles will Hive observe?
> 
> Then someone writes data into IoTDB, and finally there are 5 TsFiles and an
> incomplete TsFile; how many TsFiles will Hive observe?
> 
> Best,
> -----------------------------------
> Xiangdong Huang
> School of Software, Tsinghua University
> 
>  黄向东
> 清华大学 软件学院
> 
> 
> Jialin Qiao <[email protected]> wrote on Mon, Oct 14, 2019 at 8:14 AM:
> 
> > Hi,
> >
> > > 1. Hive 3.x needs hadoop 3.x, but our project is based on hadoop 2.x,
> > > and these two are incompatible.
> >
> > We could upgrade the version if users need 3.x. There isn't any particular
> > reason that we are based on hadoop-2.x...
> >
> > Best,
> > --
> > Jialin Qiao
> > School of Software, Tsinghua University
> >
> > 乔嘉林
> > 清华大学 软件学院
> >
> > > -----Original Message-----
> > > From: "Yuan Tian" <[email protected]>
> > > Sent: 2019-10-13 14:16:32 (Sunday)
> > > To: [email protected]
> > > Cc:
> > > Subject: Re: Questions about Hive Integration
> > >
> > > Hi,
> > >
> > > 1. Hive 3.x needs hadoop 3.x, but our project is based on hadoop 2.x,
> > > and these two are incompatible.
> > >
> > > 2. Currently, we don't support insert operations in Hive; only query
> > > operations are supported.
> > >
> > > 3. If there is more than one TsFile in a folder, all the TsFiles in that
> > > folder will be pre-read by the InputFormat and then filtered by the table
> > > name (i.e., device_id) and the fields (i.e., sensor_id), so the result set
> > > returned to the user will only contain the data they want.
> > >
> > >
> > > > -----Original Message-----
> > > > From: "Xiangdong Huang" <[email protected]>
> > > > Sent: 2019-10-12 14:38:36 (Saturday)
> > > > To: [email protected]
> > > > Cc:
> > > > Subject: Questions about Hive Integration
> > > >
> > > > Hi,
> > > >
> > > > Today I met some users from Inspur (a famous Chinese company), and they
> > > > raised some requirements about using IoTDB with Hive.
> > > >
> > > > 1. First, they use hadoop 3.x.
> > > >
> > > > 2. Can we use Hive to write data back to HDFS in TsFile format?
> > > > e.g., HiveQL: create table ..... LOCATION HDFS_FILE
> > > >
> > > > 3. If a Hive user uses the Load command on an IoTDB data folder, what
> > > > happens?
> > > > E.g., HiveQL: load data inpath 'iotdb/data/root.sg1/' into table
> > > > tablename...
> > > > Then, which TsFiles will be observed by Hive? If more TsFiles in the
> > > > folder are generated by IoTDB, can Hive observe them automatically?
> > > >
> > > > Best,
> > > > -----------------------------------
> > > > Xiangdong Huang
> > > > School of Software, Tsinghua University
> > > >
> > > >  黄向东
> > > > 清华大学 软件学院
> >
