I have a relation X with a single tuple that holds the contents of an HTML file:
X = (file:chararray)
X =
(<html><body><h2>hie</h2><h2>hie</h2><h2>hie</h2><h2>hie</h2>djfkdj<p>jhsdaj</p><h2>hie</h2></body></html>)

In Y I have each tag name with its start and end index in that string, like
Y =
tag,start,end
(html,0,105)
(body,6,98)
(h2,12,24)
(h2,24,36)
(h2,36,48)
(h2,48,60)
(p,66,79)
(h2,79,91)

Z = FOREACH Y GENERATE udf(??);
(what parameters should I pass to the udf so that it receives X.file?)

My question is: for each tuple in Y, how do I store the part of the file
string from the start index to the end index in another alias, say Z?
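
To make the expected output concrete, here is a minimal sketch of the slicing in plain Python, outside Pig. The names X_FILE and slice_by_index are made up for illustration and are not Pig APIs. I am assuming start is inclusive and end is exclusive, which matches the indices listed above.

```python
# Plain-Python sketch of the desired result (hypothetical names, not Pig APIs).
X_FILE = ("<html><body><h2>hie</h2><h2>hie</h2><h2>hie</h2><h2>hie</h2>"
          "djfkdj<p>jhsdaj</p><h2>hie</h2></body></html>")

# (tag, start, end) rows, copied from Y.
Y = [
    ("html", 0, 105),
    ("body", 6, 98),
    ("h2", 12, 24),
    ("h2", 24, 36),
    ("h2", 36, 48),
    ("h2", 48, 60),
    ("p", 66, 79),
    ("h2", 79, 91),
]


def slice_by_index(file, start, end):
    """Return the part of file from start (inclusive) to end (exclusive)."""
    return file[start:end]


# Z pairs each tag with its slice of the single file string.
# The file string is referenced once per row, never copied into Y.
Z = [(tag, slice_by_index(X_FILE, start, end)) for tag, start, end in Y]
```

What I am looking for is the Pig equivalent of the last line, where the udf sees the one file string for every row of Y.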

JOIN or CROSS is not an option because I want to avoid storing the file contents redundantly with every row of Y.

Any alternative implementation or idea is welcome.

-- 
Vignesh Miriyala
http://web.iiit.ac.in/~miriyala
http://vigneshmiriyala.wordpress.com
