Re: Re: Questions about Hive Integration

田原 Sun, 13 Oct 2019 18:47:31 -0700

Hi,

I ran a test. At first there were two tsfiles in the folder. While a 
time-consuming query was running, I added a new tsfile to that folder. 
The ongoing query did not observe the new tsfile, but a subsequent query 
did.


Incomplete files can be filtered out in the InputFormat, but for now I just 
take all files into account, on the assumption that the files in that 
specific folder are all complete.
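
As a rough illustration of the filtering mentioned above, here is a minimal 
sketch in plain Java. It is not the code of the Hive connector. It assumes 
that a complete tsfile starts and ends with the ASCII marker "TsFile" and 
uses the ".tsfile" extension; both should be checked against the TsFile 
version in use. The class TsFileCompletenessFilter and its methods are 
hypothetical names, and the sketch reads a local folder rather than HDFS.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper: keeps only files that look like complete tsfiles. */
public class TsFileCompletenessFilter {

    // Assumption: a sealed tsfile ends with this marker (and also starts with it).
    static final byte[] MAGIC = "TsFile".getBytes(StandardCharsets.US_ASCII);

    /** Returns true if the last bytes of the file equal MAGIC. */
    public static boolean isComplete(Path file) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            long length = raf.length();
            // A file that can only hold the head marker cannot be complete.
            if (length < 2L * MAGIC.length) {
                return false;
            }
            byte[] tail = new byte[MAGIC.length];
            raf.seek(length - MAGIC.length);
            raf.readFully(tail);
            return Arrays.equals(tail, MAGIC);
        }
    }

    /** Lists the *.tsfile entries of a folder that pass isComplete, sorted by path. */
    public static List<Path> listCompleteTsFiles(Path folder) throws IOException {
        List<Path> complete = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(folder, "*.tsfile")) {
            for (Path p : stream) {
                if (Files.isRegularFile(p) && isComplete(p)) {
                    complete.add(p);
                }
            }
        }
        Collections.sort(complete);
        return complete;
    }
}
```

In the InputFormat, a check of this kind would presumably run while the 
input files are listed, so that only files that pass it are turned into 
splits.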


> -----Original Message-----
> From: "Xiangdong Huang" <[email protected]>
> Sent: 2019-10-14 08:22:11 (Monday)
> To: [email protected]
> Cc: 
> Subject: Re: Questions about Hive Integration
> 
> Hi,
> 
> What is more important is the 3rd question...
> 
> Suppose IoTDB writes data into hdfs:///usr/iotdb/data/root.sg1
> Now there are two complete tsfiles and an incomplete tsfile.
> 
> If a user creates a Hive table referring to this folder, how many
> tsfiles will Hive observe?
> 
> Then someone writes data into IoTDB, and finally there are 5 tsfiles and an
> incomplete tsfile. How many tsfiles will Hive observe then?
> 
> Best,
> -----------------------------------
> Xiangdong Huang
> School of Software, Tsinghua University
> 
>  黄向东
> 清华大学 软件学院
> 
> 
> On Mon, Oct 14, 2019 at 8:14 AM, Jialin Qiao <[email protected]> wrote:
> 
> > Hi,
> >
> > > 1. Hive 3.x needs Hadoop 3.x, but our project is based on Hadoop 2.x,
> > > and these two are incompatible.
> >
> > We could upgrade the version if users need 3.x. There isn't any particular
> > reason that we are based on hadoop-2.x...
> >
> > Best,
> > --
> > Jialin Qiao
> > School of Software, Tsinghua University
> >
> > 乔嘉林
> > 清华大学 软件学院
> >
> > > -----Original Message-----
> > > From: "田原" <[email protected]>
> > > Sent: 2019-10-13 14:16:32 (Sunday)
> > > To: [email protected]
> > > Cc:
> > > Subject: Re: Questions about Hive Integration
> > >
> > > Hi,
> > >
> > > 1. Hive 3.x needs Hadoop 3.x, but our project is based on Hadoop 2.x,
> > > and these two are incompatible.
> > >
> > > 2. Currently we don't support insert operations in Hive. Only query
> > > operations are supported.
> > >
> > > 3. If there is more than one tsfile in a folder, all the tsfiles in that
> > > folder are pre-read by the InputFormat and then filtered by the table
> > > name (i.e. device_id) and the field (i.e. sensor_id). So the result set
> > > returned to the user contains only the data they want.
> > >
> > >
> > > > -----Original Message-----
> > > > From: "Xiangdong Huang" <[email protected]>
> > > > Sent: 2019-10-12 14:38:36 (Saturday)
> > > > To: [email protected]
> > > > Cc:
> > > > Subject: Questions about Hive Integration
> > > >
> > > > Hi,
> > > >
> > > > Today I met some users from Inspur (a famous Chinese company), and
> > > > they raised some requirements about IoTDB with Hive.
> > > >
> > > > 1. First, they use hadoop 3.x.
> > > >
> > > > 2. Can we use Hive to write data back to HDFS with TsFile format?
> > > > e.g., HiveQL: create table ..... LOCATION HDFS_FILE
> > > >
> > > > 3. If a Hive user runs the Load command on an IoTDB data folder, what
> > > > happens?
> > > > E.g., HiveQL: load data inpath 'iotdb/data/root.sg1/' into table
> > > > tablename...
> > > > Then, which table files will be observed by Hive? If more TsFiles are
> > > > generated in the folder by IoTDB, can Hive observe them automatically?
> > > >
> > > > Best,
> > > > -----------------------------------
> > > > Xiangdong Huang
> > > > School of Software, Tsinghua University
> > > >
> > > >  黄向东
> > > > 清华大学 软件学院
> >
