Hi Sebastian,

Thanks for your suggestion. Is there a way to hide the password value in the
parameter table that is printed to the hadoop.log file?
We also ran into an issue where an HTTPS connection could not be established
with Elasticsearch. Do you have any suggestions for solving this problem?
Thanks.



Best Regards,
 Shi Wei

-----Original Message-----
From: Sebastian Nagel <wastl.na...@googlemail.com.INVALID> 
Sent: Friday, 12 November, 2021 1:20 AM
To: user@nutch.apache.org
Subject: Re: encrypt password of the index-writer.xml

Hi Shi Wei,

There is a way, although definitely not the recommended one.
Sorry, it took me a little while to prove it.

Do you know about external XML entities or XXE attacks?

1. At the top of index-writers.xml, add an entity declaration:

<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE urlset [
  <!ENTITY CREDENTIALS SYSTEM "file:///path/to/credentials.txt">
]>


2. The entity is then referenced later in the index writer spec:

  <writer id="indexer_solr_1"
          class="org.apache.nutch.indexwriter.solr.SolrIndexWriter">
    <parameters>
      ...
      &CREDENTIALS;
    </parameters>

3. Put your credentials snippet into the file /path/to/credentials.txt:

<param name="username" value="username"/>
<param name="password" value="SECRET"/>

4. And voilà, the parameters show up when the indexer runs:

$> bin/nutch index crawldb segment
...
├────────────┼─────────────────────────────┼─────────┤
│username    │The username of Solr server. │username │
├────────────┼─────────────────────────────┼─────────┤
│password    │The password of Solr server. │SECRET   │
└────────────┴─────────────────────────────┴─────────┘
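
For reference, steps 1 to 3 combined would look roughly like the sketch below.
It assumes the usual <writers> root element of index-writers.xml. The other
parameters and the rest of the writer definition are elided here and should be
kept exactly as they are in your existing file. The DOCTYPE name is copied
unchanged from step 1; a non-validating parser does not check it against the
root element.

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE urlset [
  <!-- external entity pointing at the file from step 3 -->
  <!ENTITY CREDENTIALS SYSTEM "file:///path/to/credentials.txt">
]>
<!-- assumption: keep the namespace attributes of your existing root element -->
<writers>
  <writer id="indexer_solr_1"
          class="org.apache.nutch.indexwriter.solr.SolrIndexWriter">
    <parameters>
      <!-- ... other parameters unchanged ... -->
      <!-- expands to the two <param> elements in credentials.txt -->
      &CREDENTIALS;
    </parameters>
    <!-- ... rest of the writer definition unchanged ... -->
  </writer>
</writers>
```

The username and password <param> elements should then appear only in
credentials.txt and not inline in index-writers.xml; otherwise they would be
defined twice.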


Note: this is a dirty hack but not a security issue: anyone with write access
to index-writers.xml can already put anything into it. There is no guarantee,
however, that this hack will continue to work in the future.

Would you be so kind as to open a Jira issue to add real support for
passwords in index-writers.xml?

Best,
Sebastian



On 11/10/21 11:16, sw.l...@quandatics.com wrote:
> Hi,
>
> We have tried the variable expansion method on the index-writers.xml,
> but it doesn't work. Could you advise if there are any alternative ways to
> encrypt the password in the index-writers.xml file?
>
> Best Regards,
>
> Shi Wei
