Efficient ways to parse xml from hive column(for selection/filters based on xml node values)

2011-12-23 Thread ravikumar visweswara
Hello All,

One of my Hive columns holds text data in XML format. What are the
efficient ways to parse the XML and query on certain node values? The
business users' select/filter queries are based on 6 or 7 nodes in the XML.
Is there any built-in support, or are there supporting libraries, for this
in Hive? I have used a SerDe for unstructured log parsing, but I wanted to
check for the most efficient way that avoids writing specific UDFs to parse
the XML.

Could some of you share your experiences and best practices?

Thanks and Regards
R


Re: Efficient ways to parse xml from hive column(for selection/filters based on xml node values)

2011-12-23 Thread Mark Grover
You might want to take a look at this:
https://cwiki.apache.org/Hive/languagemanual-xpathudf.html
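
For selecting and filtering on a handful of nodes, a rough sketch using the
built-in xpath UDFs might look like the query below. The table name (orders),
its columns (id, payload) and the XML layout are made up for illustration;
only the xpath_string, xpath_int and xpath_double functions come from Hive
itself.

```sql
-- Hypothetical table: orders(id BIGINT, payload STRING), where payload holds
-- XML shaped like:
--   <order><customer><id>42</id></customer>
--          <status>SHIPPED</status><total>19.99</total></order>
SELECT
  id,
  xpath_string(payload, '/order/status')   AS status,      -- node text as STRING
  xpath_int(payload, '/order/customer/id') AS customer_id, -- node text as INT
  xpath_double(payload, '/order/total')    AS total        -- node text as DOUBLE
FROM orders
WHERE xpath_string(payload, '/order/status') = 'SHIPPED'
  AND xpath_double(payload, '/order/total') > 10.0;
```

As far as I know, each xpath_* call evaluates against the raw XML string
separately, so if the same 6 or 7 nodes are queried again and again it may be
cheaper to extract them once into a regular table (for example with CREATE
TABLE ... AS SELECT) and point the business users at that table instead.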


Mark Grover, Business Intelligence Analyst
OANDA Corporation 

www: oanda.com www: fxtrade.com 
e: mgro...@oanda.com 


