Hi,
Does Nutch 1.4 support POST-based authentication that depends
on cookies?
If not, how do I work around sites that need this authentication?
Thanks,
--
View this message in context:
http://lucene.472066.n3.nabble.com/Http-Post-authentication-tp3993343.html
Sent from the Nutch
Thanks! That fixed my problem :)
--
View this message in context:
http://lucene.472066.n3.nabble.com/Nutch-1-4-with-Solr-3-6-compatible-tp3992890p3993327.html
Sent from the Nutch - User mailing list archive at Nabble.com.
Sounds great, glad you got something.
Lewis
On Thu, Jul 5, 2012 at 6:04 PM, JAB wrote:
>
> Thanks for the advice. Currently I'm looking at a simplified GATE Gazetteer
> approach. My customer isn't clear on what he wants and the requirements I
> came up with may be overkill.
>
> --
> View this mes
What do your schema and the accompanying solr-mapping.xml look like?
These need to be spot on, or else you can expect confusing results at times.
hth
On Thu, Jul 5, 2012 at 7:58 PM, Jim Chandler wrote:
> Markus,
>
> Thanks for the speedy reply.
>
> I attempted your suggestion and my error changed now I'm gett
Markus,
Thanks for the speedy reply.
I attempted your suggestion and my error changed; now I'm getting:
org.apache.solr.common.SolrException: [doc=null] missing required field: id
I am relatively new at this and any help is much appreciated.
Thanks
On Thu, Jul 5, 2012 at 11:41 AM, Markus Jelsm
Thanks for the advice. Currently I'm looking at a simplified GATE Gazetteer
approach. My customer isn't clear on what he wants and the requirements I
came up with may be overkill.
--
View this message in context:
http://lucene.472066.n3.nabble.com/Re-Nutch-Author-Publication-and-Religion-Detect
Hello,
The index-more plugin might run after your custom plugin. You can configure the
order in which plugins are run. Please consult the indexingfilter.order
directive's description in conf/nutch-default.xml.
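
A minimal sketch of such an override, assuming it is placed in conf/nutch-site.xml and that the custom filter class is called com.example.MyIndexingFilter (a hypothetical name; substitute your own). Per the property's description, when the value is not empty only the listed filters are loaded, in the given order, so every filter you need must appear in the list:

```xml
<!-- Sketch only: com.example.MyIndexingFilter is a hypothetical class name. -->
<property>
  <name>indexingfilter.order</name>
  <!-- Space-separated class names; filters run left to right. -->
  <value>org.apache.nutch.indexer.basic.BasicIndexingFilter org.apache.nutch.indexer.more.MoreIndexingFilter com.example.MyIndexingFilter</value>
</property>
```

With index-more listed before the custom filter, the fields it adds should already be on the document when the custom filter runs.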
Cheers,
-Original message-
> From:Jim Chandler
> Sent: Thu 05-Jul-2012
Greetings All,
I'm trying to access NutchFields that have been written to the
NutchDocument earlier by the index-more plugin. When I use
NutchDocument.getFields(), all that is returned is the segment and digest
fields. I know that index-more adds date, type, and content-length. Could
someone p
URI consistency is not under our control. Perhaps we should attempt to identify
these pages first.
Thanks
-Original message-
> From:Lewis John Mcgibbney
> Sent: Thu 05-Jul-2012 10:56
> To: user@nutch.apache.org
> Subject: Re: Adaptive scheduling, but different
>
> Hi Markus,
> This
Hi Markus,
This is a tricky one. I have personally had terrible headaches with
similar problems where an update to a piece of legislation completely
changes its URL, which makes the task of provenance hellishly
complex... We addressed this by ensuring that legislation URIs stay
consistent regardl
Any ideas?
-Original message-
> From:Markus Jelsma
> Sent: Mon 02-Jul-2012 23:05
> To: user@nutch.apache.org
> Subject: Adaptive scheduling, but different
>
> Hi,
>
> We use an adaptive scheduler for our crawl. This works fine for most cases,
> but a specific type of page is crawled