Re: Auto Refresh Hive Table Metadata
> By the way, if you want near-real-time tables with Hive, maybe you should
> have a look at this project from Uber: https://uber.github.io/hudi/
> I don't know how mature it is yet, but I think it aims at solving that kind
> of challenge.

Depending on your Hive setup, you don't need a different backend to do near-real-time tables.

https://cwiki.apache.org/confluence/display/Hive/Streaming+Data+Ingest

Prasanth has a benchmark for Hive 3.x, which is limited by HDFS bandwidth at the moment with 64 threads.

https://github.com/prasanthj/culvert

$ ./culvert -u thrift://localhost:9183 -db testing -table culvert -p 64 -n 10
Total rows committed: 9210
Throughput: 1535000 rows/second

Cheers,
Gopal
Re: Auto Refresh Hive Table Metadata
Hi Chintan,

Yes, this sounds weird...

"REFRESH TABLES" is the kind of statement required by SQL engines such as Impala, Presto or Spark-SQL that cache metadata from the Metastore, but vanilla Hive usually doesn't cache it and queries the metastore every time (unless some new feature was added recently, in which case it is probably possible to disable it with some option).

In other words, as long as you add new files to your existing partitions, they should be automatically readable from Hive. If you add new partitions, that's a different story, of course, because each new partition has to be registered in the metastore before Hive can see it.

Are you sure you are using Hive here, and not Spark-SQL or something else?

By the way, if you want near-real-time tables with Hive, maybe you should have a look at this project from Uber: https://uber.github.io/hudi/
I don't know how mature it is yet, but I think it aims at solving that kind of challenge.

Regards,
Furcy

On Thu, 9 Aug 2018 at 18:30, Will Du wrote:

> i never experienced such kind of issue. Once data is loaded to HDFS by
> sink, the data is available in hive.
>
> Sent from my iPhone
>
> On Aug 9, 2018, at 10:18, Chintan Patel wrote:
>
> Hello Will Du,
>
> I'm using Kafka connector to create hive database. All the data are stored
> in s3 bucket and using mysql database for metastore.
>
> For example If connector add new records in hive table and If I run query
> It's not returning latest data and I have to run refresh table {table_name}
> to clear metastore cache. Now If I have 1000 hive table and I want to
> update those tables every 5 mins, running refresh query is not good idea I
> guess.
>
> So I was thinking if hive has some type of mechanism to do it in
> background then it will be good.
>
>
> On 9 August 2018 at 17:51, Will Du wrote:
>
>> any reason to do this?
>>
>> Sent from my iPhone
>>
>> > On Aug 9, 2018, at 07:57, Chintan Patel wrote:
>> >
>> > Hello,
>> >
>> > I want to refresh external type hive table metadata on some regular
>> > interval without using "refresh table {table_name}".
>> >
>> > Thanks & Regards
>> >
>> >
>
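
As an illustration of the "new partitions are a different story" remark above, here is a minimal sketch of how a loader could build the HiveQL needed to register a new partition of an external table. ALTER TABLE ... ADD IF NOT EXISTS PARTITION and MSCK REPAIR TABLE are standard Hive statements; everything else (the helper names, the table name "events", the partition column "dt" and the bucket "example-bucket") is hypothetical and not taken from this thread. The sketch only builds the statement text; running it through beeline or a Hive client is left out.

```python
from typing import Mapping


def quote_literal(value: str) -> str:
    """Quote a string as a single-quoted HiveQL literal (hypothetical helper)."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def add_partition_statement(table: str, spec: Mapping[str, str], location: str) -> str:
    """Build the statement that registers one new partition in the metastore."""
    cols = ", ".join(f"{col}={quote_literal(val)}" for col, val in spec.items())
    return (
        f"ALTER TABLE {table} ADD IF NOT EXISTS "
        f"PARTITION ({cols}) LOCATION {quote_literal(location)}"
    )


def repair_statement(table: str) -> str:
    """Build the statement that asks Hive to discover partition directories itself."""
    return f"MSCK REPAIR TABLE {table}"


if __name__ == "__main__":
    # Hypothetical table, partition column and S3 location, for illustration only.
    print(add_partition_statement(
        "events", {"dt": "2018-08-09"}, "s3a://example-bucket/events/dt=2018-08-09"
    ))
    print(repair_statement("events"))
```

The explicit ADD PARTITION form touches only the partition the loader just wrote, which is likely cheaper across many tables than a full MSCK REPAIR TABLE scan. MSCK REPAIR TABLE also assumes the directories follow the key=value layout.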
Re: Auto Refresh Hive Table Metadata
I have never experienced this kind of issue. Once data is loaded to HDFS by the sink, the data is available in Hive.

Sent from my iPhone

> On Aug 9, 2018, at 10:18, Chintan Patel wrote:
>
> Hello Will Du,
>
> I'm using Kafka connector to create hive database. All the data are stored in
> s3 bucket and using mysql database for metastore.
>
> For example If connector add new records in hive table and If I run query
> It's not returning latest data and I have to run refresh table {table_name}
> to clear metastore cache. Now If I have 1000 hive table and I want to update
> those tables every 5 mins, running refresh query is not good idea I guess.
>
> So I was thinking if hive has some type of mechanism to do it in background
> then it will be good.
>
>
>> On 9 August 2018 at 17:51, Will Du wrote:
>> any reason to do this?
>>
>> Sent from my iPhone
>>
>> > On Aug 9, 2018, at 07:57, Chintan Patel wrote:
>> >
>> > Hello,
>> >
>> > I want to refresh external type hive table metadata on some regular
>> > interval without using "refresh table {table_name}".
>> >
>> > Thanks & Regards
>> >
Re: Auto Refresh Hive Table Metadata
Hello Will Du,

I'm using a Kafka connector to create the Hive database. All the data is stored in an S3 bucket, and I'm using a MySQL database for the metastore.

For example, if the connector adds new records to a Hive table and I then run a query, it does not return the latest data, and I have to run refresh table {table_name} to clear the metastore cache. Now, if I have 1000 Hive tables and I want to update those tables every 5 minutes, running a refresh query is not a good idea, I guess.

So I was thinking it would be good if Hive had some kind of mechanism to do this in the background.

On 9 August 2018 at 17:51, Will Du wrote:

> any reason to do this?
>
> Sent from my iPhone
>
> > On Aug 9, 2018, at 07:57, Chintan Patel wrote:
> >
> > Hello,
> >
> > I want to refresh external type hive table metadata on some regular
> > interval without using "refresh table {table_name}".
> >
> > Thanks & Regards
> >
>
Re: Auto Refresh Hive Table Metadata
Any reason to do this?

Sent from my iPhone

> On Aug 9, 2018, at 07:57, Chintan Patel wrote:
>
> Hello,
>
> I want to refresh external type hive table metadata on some regular interval
> without using "refresh table {table_name}".
>
> Thanks & Regards
>
Auto Refresh Hive Table Metadata
Hello,

I want to refresh the metadata of an external Hive table at a regular interval without using "refresh table {table_name}".

Thanks & Regards