Re: Cant integrate the kerberos enabled solr cloud with nutch

Sebastian Nagel Fri, 22 Oct 2021 02:46:36 -0700

Hi Shi Wei,

could you also share the index writer configuration (conf/index-writers.xml)?


The default is unauthenticated access to Solr (auth="false"), see the snippet below.
The file httpclient-auth.xml is not relevant for the Solr indexer. It is only
used by the plugin protocol-httpclient when a crawled web site requires
authentication in order to fetch the content.

Best,
Sebastian

  <writer id="indexer_solr_1" class="org.apache.nutch.indexwriter.solr.SolrIndexWriter">
    <parameters>
      <param name="type" value="http"/>
      <param name="url" value="http://localhost:8983/solr/nutch"/>
      <param name="collection" value=""/>
      <param name="weight.field" value=""/>
      <param name="commitSize" value="1000"/>
      <param name="auth" value="false"/>
      <param name="username" value="username"/>
      <param name="password" value="password"/>
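
For comparison, a minimal sketch of the same three parameters with authentication switched on. It assumes the Solr endpoint accepts HTTP Basic credentials. The user name and password values are placeholders, not taken from any real setup. This setting only sends Basic credentials and does not perform a Kerberos/SPNEGO negotiation, so it may well not be sufficient for a cluster that accepts Kerberos only.

```xml
<!-- Fragment only: these lines belong inside <parameters> of the Solr
     writer in conf/index-writers.xml; all other params stay unchanged. -->
<param name="auth" value="true"/>
<!-- Placeholder credentials (assumption), replace with real ones. -->
<param name="username" value="solr-user"/>
<param name="password" value="solr-password"/>
```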


On 10/22/21 10:10 AM, [email protected] wrote:

Hi,

We have run into a problem: we cannot integrate a Kerberos-enabled SolrCloud
with Nutch.
When we execute the command "nutch index crawl/crawldb/ -linkdb crawl/linkdb/ $s1 -filter -normalize", it fails with "HTTP ERROR 401 Problem accessing /solr/admin/collections. Reason: Authentication required", although we are able to access it via curl with the keytab.
Nutch version: 1.18

Yours sincerely,

Shi Wei
