Re: Multi character delimiter for Hive Columns and Rows
A custom SerDe would be your best bet. We're using one to do exactly that.

Regards,
Rick

On Apr 28, 2011 11:29 AM, "Shantian Purkad" wrote:
> Any suggestions?
>
> From: Shantian Purkad
> To: user@hive.apache.org
> Sent: Tue, April 26, 2011 11:05:46 PM
> Subject: Multi character delimiter for Hive Columns and Rows
>
> Hello,
>
> We have a situation where the data coming from source systems to hive may
> contain the common characters and delimiters such as |, tabs, new line
> characters etc.
>
> We may have to use multi character delimiters such as "|#" for columns and
> "||#" for rows.
>
> How can we achieve this? In this case our single rows may look like below (|#is
> column delimiter and ||#is row delimiter
>
> row 1 col1 |# row 1 col2 |# row 1 col 3 has
> two
> new line characters |# and this is
> the last column of row 1 ||# row 2 col1 |# row 2 col2 |# row 2 col 3 has
> one tab and one new line character |# and this is
> the last column of row 2 ||#
>
> Would custom SerDe help us handle this situation?
>
> Thanks and Regards,
> Shantian
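For anyone finding this thread later, here is a minimal sketch of the splitting logic such a SerDe would need. It is plain Java with no Hive dependencies, and it is not the SerDe Rick mentions. The class and method names are made up for illustration, and trimming the whitespace around each column is an assumption based on the sample data in the question. Note that "||#" contains "|#", so rows have to be split before columns. Also, as far as I understand, record boundaries in Hive are normally decided by the InputFormat rather than by the SerDe, so the row-splitting half may belong in a custom InputFormat/RecordReader.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hive: shows only the delimiter handling.
final class MultiCharDelimiterSketch {
    static final String COLUMN_DELIMITER = "|#";
    static final String ROW_DELIMITER = "||#";

    // Split the raw text into rows first, because "||#" contains "|#".
    static List<String> splitRows(String data) {
        List<String> rows = new ArrayList<>();
        for (String row : data.split(Pattern.quote(ROW_DELIMITER))) {
            // Skip the empty remainder after the final row delimiter.
            if (!row.trim().isEmpty()) {
                rows.add(row);
            }
        }
        return rows;
    }

    // Split one row into columns; embedded tabs and newlines are preserved.
    static List<String> splitColumns(String row) {
        List<String> columns = new ArrayList<>();
        // A limit of -1 keeps trailing empty columns.
        for (String column : row.split(Pattern.quote(COLUMN_DELIMITER), -1)) {
            // Assumption: whitespace around a delimiter is padding, not data.
            columns.add(column.trim());
        }
        return columns;
    }

    static List<List<String>> parse(String data) {
        List<List<String>> table = new ArrayList<>();
        for (String row : splitRows(data)) {
            table.add(splitColumns(row));
        }
        return table;
    }
}
```

Calling MultiCharDelimiterSketch.parse on the sample from the question should give two rows of four columns each, with the newlines and the tab still inside the third column. In a real SerDe the same column split would go in deserialize(), applied to the text of one record.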
Re: Multi character delimiter for Hive Columns and Rows
Any suggestions?

From: Shantian Purkad
To: user@hive.apache.org
Sent: Tue, April 26, 2011 11:05:46 PM
Subject: Multi character delimiter for Hive Columns and Rows

Hello,

We have a situation where the data coming from source systems to hive may contain the common characters and delimiters such as |, tabs, new line characters etc.

We may have to use multi character delimiters such as "|#" for columns and "||#" for rows.

How can we achieve this? In this case our single rows may look like below (|#is column delimiter and ||#is row delimiter

row 1 col1 |# row 1 col2 |# row 1 col 3 has
two
new line characters |# and this is
the last column of row 1 ||# row 2 col1 |# row 2 col2 |# row 2 col 3 has
one tab and one new line character |# and this is
the last column of row 2 ||#

Would custom SerDe help us handle this situation?

Thanks and Regards,
Shantian
Multi character delimiter for Hive Columns and Rows
Hello,

We have a situation where the data coming from source systems into Hive may contain common delimiter characters such as |, tabs, newline characters, etc.

We may have to use multi-character delimiters such as "|#" for columns and "||#" for rows.

How can we achieve this? In this case our rows may look like the sample below (|# is the column delimiter and ||# is the row delimiter):

row 1 col1 |# row 1 col2 |# row 1 col 3 has
two
new line characters |# and this is
the last column of row 1 ||# row 2 col1 |# row 2 col2 |# row 2 col 3 has
one tab and one new line character |# and this is
the last column of row 2 ||#

Would a custom SerDe help us handle this situation?

Thanks and Regards,
Shantian