Re: Hive_CSV
The data is already in the CSV file, so the redundancy does not matter for querying. It is recommended to convert the data to ORC or Parquet for querying.

> On 09 Mar 2016, at 19:09, Ajay Chander wrote:
>
> Daniel, thanks for your time. Does that mean creating two tables, one to
> hold all the data and another to hold only the required data selected from
> it? If so, I am concerned about the redundant data. Please correct me if
> I am wrong. Thanks
>
>> On Wednesday, March 9, 2016, Daniel Haviv wrote:
>> Hi Ajay,
>> Use the CSV serde to read your file, map all three columns, but select
>> only the relevant ones when you insert:
>>
>> Create table csvtab (
>> irrelevant string,
>> sportName string,
>> sportType string) ...
>>
>> Insert into loaded_table select sportName, sportType from csvtab;
>>
>> Daniel
>>
>> > On 9 Mar 2016, at 19:43, Ajay Chander wrote:
>> >
>> > Hi Everyone,
>> >
>> > I am looking for a way to ignore the first occurrence of the delimiter
>> > while loading data from a CSV file into a Hive external table.
>> >
>> > CSV file:
>> >
>> > Xyz, baseball, outdoor
>> >
>> > The Hive table has two columns, sport_name and sport_type, and the
>> > fields are separated by ','.
>> >
>> > I want to load my data into the table so that the load ignores
>> > everything up to the first delimiter (that is, it ignores Xyz) and
>> > loads the data that follows it.
>> >
>> > In the end, my Hive table should contain the following data:
>> >
>> > Baseball, outdoor
>> >
>> > Any inputs are appreciated. Thank you for your time.
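As a rough illustration of the ORC/Parquet suggestion, the conversion could be done with a CREATE TABLE AS SELECT. This is only a sketch: it assumes the three-column staging table csvtab from Daniel's message already exists, and the table name sports_orc is a hypothetical choice.

```sql
-- Sketch only: assumes the staging table csvtab (irrelevant, sportName, sportType)
-- from Daniel's message already exists. sports_orc is a hypothetical name.
CREATE TABLE sports_orc
STORED AS ORC
AS
SELECT trim(sportName) AS sport_name,  -- trim() drops the space after each comma
       trim(sportType) AS sport_type
FROM csvtab;
```

Replacing ORC with PARQUET in the STORED AS clause should give the Parquet variant. Only the two wanted columns are written, so the unwanted first field exists only in the original CSV file.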
Re: Hive_CSV
Daniel, thanks for your time. Does that mean creating two tables, one to hold all the data and another to hold only the required data selected from it? If so, I am concerned about the redundant data. Please correct me if I am wrong. Thanks

On Wednesday, March 9, 2016, Daniel Haviv wrote:
> Hi Ajay,
> Use the CSV serde to read your file, map all three columns, but select
> only the relevant ones when you insert:
>
> Create table csvtab (
> irrelevant string,
> sportName string,
> sportType string) ...
>
> Insert into loaded_table select sportName, sportType from csvtab;
>
> Daniel
>
> > On 9 Mar 2016, at 19:43, Ajay Chander wrote:
> >
> > Hi Everyone,
> >
> > I am looking for a way to ignore the first occurrence of the delimiter
> > while loading data from a CSV file into a Hive external table.
> >
> > CSV file:
> >
> > Xyz, baseball, outdoor
> >
> > The Hive table has two columns, sport_name and sport_type, and the
> > fields are separated by ','.
> >
> > I want to load my data into the table so that the load ignores
> > everything up to the first delimiter (that is, it ignores Xyz) and
> > loads the data that follows it.
> >
> > In the end, my Hive table should contain the following data:
> >
> > Baseball, outdoor
> >
> > Any inputs are appreciated. Thank you for your time.
Re: Hive_CSV
Jörn, thanks for your time. The reason I want to do this is that I don't want to bring unnecessary data into the table. Each record carries an unnecessary value.

On Wednesday, March 9, 2016, Jörn Franke wrote:
> Why don't you load all the data and use just two columns for querying?
> Alternatively, use regular expressions.
>
> > On 09 Mar 2016, at 18:43, Ajay Chander wrote:
> >
> > Hi Everyone,
> >
> > I am looking for a way to ignore the first occurrence of the delimiter
> > while loading data from a CSV file into a Hive external table.
> >
> > CSV file:
> >
> > Xyz, baseball, outdoor
> >
> > The Hive table has two columns, sport_name and sport_type, and the
> > fields are separated by ','.
> >
> > I want to load my data into the table so that the load ignores
> > everything up to the first delimiter (that is, it ignores Xyz) and
> > loads the data that follows it.
> >
> > In the end, my Hive table should contain the following data:
> >
> > Baseball, outdoor
> >
> > Any inputs are appreciated. Thank you for your time.
Re: Hive_CSV
Hi Ajay,
Use the CSV serde to read your file, map all three columns, but select only the relevant ones when you insert:

    Create table csvtab (
        irrelevant string,
        sportName string,
        sportType string) ...

    Insert into loaded_table select sportName, sportType from csvtab;

Daniel

> On 9 Mar 2016, at 19:43, Ajay Chander wrote:
>
> Hi Everyone,
>
> I am looking for a way to ignore the first occurrence of the delimiter
> while loading data from a CSV file into a Hive external table.
>
> CSV file:
>
> Xyz, baseball, outdoor
>
> The Hive table has two columns, sport_name and sport_type, and the
> fields are separated by ','.
>
> I want to load my data into the table so that the load ignores
> everything up to the first delimiter (that is, it ignores Xyz) and
> loads the data that follows it.
>
> In the end, my Hive table should contain the following data:
>
> Baseball, outdoor
>
> Any inputs are appreciated. Thank you for your time.
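A fuller version of the statements above might look like the following. It is a minimal sketch, not Daniel's exact DDL: the part he elided with "..." is filled here with one plausible choice (Hive's built-in OpenCSVSerde), and the LOCATION path and the definition of loaded_table are assumptions made for illustration.

```sql
-- Staging table over the raw CSV files; as an external table it reads the
-- files in place. The LOCATION path is a hypothetical placeholder.
CREATE EXTERNAL TABLE csvtab (
  irrelevant STRING,
  sportName  STRING,
  sportType  STRING
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES ("separatorChar" = ",")
STORED AS TEXTFILE
LOCATION '/data/sports_csv';

-- Target table holding only the two wanted columns
-- (column names taken from the original question).
CREATE TABLE loaded_table (
  sport_name STRING,
  sport_type STRING
);

-- Copy across only the relevant columns; trim() removes the space that
-- follows each comma in the sample row.
INSERT INTO TABLE loaded_table
SELECT trim(sportName), trim(sportType)
FROM csvtab;
```

Because csvtab is external, it adds no second copy of the CSV data; only loaded_table stores new data, and it contains just the two selected columns.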
Re: Hive_CSV
Why don't you load all the data and use just two columns for querying? Alternatively, use regular expressions.

> On 09 Mar 2016, at 18:43, Ajay Chander wrote:
>
> Hi Everyone,
>
> I am looking for a way to ignore the first occurrence of the delimiter
> while loading data from a CSV file into a Hive external table.
>
> CSV file:
>
> Xyz, baseball, outdoor
>
> The Hive table has two columns, sport_name and sport_type, and the
> fields are separated by ','.
>
> I want to load my data into the table so that the load ignores
> everything up to the first delimiter (that is, it ignores Xyz) and
> loads the data that follows it.
>
> In the end, my Hive table should contain the following data:
>
> Baseball, outdoor
>
> Any inputs are appreciated. Thank you for your time.
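One way the regular-expression suggestion could be realized is with Hive's RegexSerDe, which maps each capture group of a pattern to one table column. The sketch below is an assumption about what was meant, not something stated in the thread; the table name sports_regex, the LOCATION path, and the pattern itself are all hypothetical.

```sql
-- Sketch only: a two-column external table that skips the first field at read time.
-- The pattern must match the whole line; rows that do not match come back as NULLs.
CREATE EXTERNAL TABLE sports_regex (
  sport_name STRING,
  sport_type STRING
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
  -- [^,]*     first field, matched but not captured
  -- ([^,]*)   second field -> sport_name
  -- (.*)      rest of the line -> sport_type
  "input.regex" = "[^,]*, *([^,]*), *(.*)"
)
STORED AS TEXTFILE
LOCATION '/data/sports_csv';

SELECT sport_name, sport_type FROM sports_regex;
```

The number of capture groups has to equal the number of columns. RegexSerDe only reads data, so a table defined this way can be queried but cannot be the target of an INSERT.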