We could create an account for the project at SO, give the user list as its
email address, and set up an alert so that any question tagged [nutch]
gets sent to user@nutch.apache.org.
That should work, shouldn't it?

On 12 February 2016 at 15:11, Mattmann, Chris A (3980) <
chris.a.mattm...@jpl.nasa.gov> wrote:

> That’s a cool idea, but how would we set up the redirect? Wouldn’t
> that have to occur at SO?
>
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Chris Mattmann, Ph.D.
> Chief Architect
> Instrument Software and Science Data Systems Section (398)
> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> Office: 168-519, Mailstop: 168-527
> Email: chris.a.mattm...@nasa.gov
> WWW:  http://sunset.usc.edu/~mattmann/
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Adjunct Associate Professor, Computer Science Department
> University of Southern California, Los Angeles, CA 90089 USA
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
> -----Original Message-----
> From: Julien Nioche <lists.digitalpeb...@gmail.com>
> Reply-To: "user@nutch.apache.org" <user@nutch.apache.org>
> Date: Wednesday, February 10, 2016 at 6:48 AM
> To: "user@nutch.apache.org" <user@nutch.apache.org>
> Subject: Re: [MASSMAIL]Extract Contact Information - Custom Parser
>
> >See SO =>
> >
> >http://stackoverflow.com/questions/35299744/nutch-parser-plugin-collect-contact-information
> >
> >There seem to be more and more people sending their questions to both
> >the ML and SO. I am wondering whether we should set up a redirect so
> >that any question asked there lands automatically on the user list.
> >Any thoughts?
> >
> >On 10 February 2016 at 14:43, Markus Jelsma <markus.jel...@openindex.io>
> >wrote:
> >
> >> Yes, I would also implement an HtmlParseFilter plugin, but execute the
> >> regex on the parseText, because that is where you are going to find
> >> phone numbers etc.
> >> Markus
> >>
> >>
> >>
> >> -----Original message-----
> >> > From: Jorge Luis Betancourt González <jlbetanco...@uci.cu>
> >> > Sent: Tuesday 9th February 2016 19:59
> >> > To: user@nutch.apache.org
> >> > Subject: Re: [MASSMAIL]Extract Contact Information - Custom Parser
> >> >
> >> > Is there any particular requirement that prevents you from
> >> > implementing your logic as an HtmlParseFilter plugin? Essentially,
> >> > the parsing will be done for you (by parse-html or parse-tika) and
> >> > all you need to do is find the right nodes and extract the desired
> >> > information (see [1]).
> >> >
> >> > Regards,
> >> >
> >> > [1] http://svn.apache.org/repos/asf/nutch/trunk/src/plugin/headings/
> >> >
> >> > ----- Original Message -----
> >> > From: "Bin Wang" <binwang...@gmail.com>
> >> > To: "Apache.Nutch.User" <user@nutch.apache.org>
> >> > Sent: Tuesday, 9 February 2016 13:19:35
> >> > Subject: [MASSMAIL]Extract Contact Information - Custom Parser
> >> >
> >> > Hi there,
> >> >
> >> > I am working on a project that needs to identify contact points on
> >> > companies' websites, for the purpose of enhancing security.
> >> >
> >> > Right now, I have managed to crawl several rounds of sites. The next
> >> > step will be to parse the HTML pages and locate where the contact
> >> > information is. In this case, I am only interested in email
> >> > addresses and phone numbers.
> >> >
> >> > Here is what I am planning to do: write a MapReduce job to parse
> >> > the HTML files and use regular expressions, in combination with an
> >> > HTML parser such as Jsoup/BeautifulSoup, to find the matches.
> >> >
> >> > However, I am wondering whether there is any parser plugin that has
> >> > already been implemented, and maybe tested, for this purpose?
> >> >
> >> > Also, any feedback on how to achieve this is much appreciated!
> >> >
> >> > Best regards,
> >> >
> >> > Bin
> >> >
> >>
> >
> >
> >
> >--
> >
> >*Open Source Solutions for Text Engineering*
> >
> >http://www.digitalpebble.com
> >http://digitalpebble.blogspot.com/
> >#digitalpebble <http://twitter.com/digitalpebble>
>
>


-- 

*Open Source Solutions for Text Engineering*

http://www.digitalpebble.com
http://digitalpebble.blogspot.com/
#digitalpebble <http://twitter.com/digitalpebble>
