You can also use parse_url(url, 'HOST') instead of a regular expression.
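
For reference, here is a minimal sketch of the three approaches that come up in this thread, side by side. It assumes a hypothetical table url_table with a string column url (neither name appears in the original messages), and the regular expression is only one possible pattern, not something posted by anyone here.

```sql
-- Hypothetical table and column names; adjust to your own schema.
SELECT
  url,
  -- Built-in URL parser.
  parse_url(url, 'HOST')                    AS host_parse_url,
  -- Array indexes start at 0: [0] is 'http:', [1] is the empty string
  -- between the two slashes of '//', and [2] is the host.
  split(url, '/')[2]                        AS host_split,
  -- One possible regex: skip the scheme and '//', capture up to the next '/'.
  regexp_extract(url, '^[^/]+//([^/]+)', 1) AS host_regex
FROM url_table;
```

For a URL such as http://www.google.com/anything/goes/here, all three columns should come out as www.google.com. The split() and regex variants assume every URL starts with a scheme followed by '//'.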

On Thu, Mar 1, 2012 at 1:32 PM, Saurabh S <saurab...@live.com> wrote:

>  Of course, it works fine now. I feel like an idiot.
>
> And that problem using parse_url also went away and I can use that as well.
>
> Thanks a bunch, Phil.
>
> > Date: Thu, 1 Mar 2012 21:22:27 +0000
> > Subject: Re: Accessing elements from array returned by split() function
> > From: philip.j.trom...@gmail.com
> > To: user@hive.apache.org
>
> >
> > I guess that split(...)[1] is giving you what's in between the 1st and
> > 2nd '/' character, which is nothing. Try split(...)[2].
> >
> > Phil.
> >
> > On 1 March 2012 21:19, Saurabh S <saurab...@live.com> wrote:
> > > Hello,
> > >
> > > I have a set of URLs which I need to parse. For example, if the url is,
> > > http://www.google.com/anything/goes/here,
> > >
> > > I need to extract www.google.com, i.e. everything between the second and
> > > third forward slashes.
> > >
> > > I can't figure out the regex pattern to do so, and am trying to use split()
> > > function instead. So, my hive query looks like
> > > select url, split(url,'/')
> > > ...
> > >
> > > The second column contains the entire array returned by the split function.
> > > Is there any way to access only the second element of the array, which will
> > > give me what I need?
> > >
> > > When I try the following statement select url, split(url,'/')[1], I get an
> > > empty second column.
> > >
> > > Is this the expected behavior? Any other suggestions on how to parse the
> > > URL?
> > >
> > > Oh by the way, I'm aware that the function parse_url(url,'HOST') will give
> > > me something similar to what I want, but for some reason, that function on
> > > my database is running extremely slow.
> > >
> > > First time posting to this list. If there is anything wrong, please let me
> > > know.
> > >
> > > Regards,
> > > Saurabh
> > >
>
