-----Original Message-----
From: sw.l...@quandatics.com <sw.l...@quandatics.com>
Sent: Friday, 12 November, 2021 5:57 PM
To: user@nutch.apache.org
Subject: RE: encrypt password of the index-writer.xml

Hi Sebastian,

Thanks for your suggestion. Is there a way to hide the password value in the table printed to the hadoop.log file?

We also ran into an issue where an HTTPS connection could not be established with Elasticsearch. Do you have any suggestions for solving this problem?

Thanks.

Best Regards,
Shi Wei

-----Original Message-----
From: Sebastian Nagel <wastl.na...@googlemail.com.INVALID>
Sent: Friday, 12 November, 2021 1:20 AM
To: user@nutch.apache.org
Subject: Re: encrypt password of the index-writer.xml

Hi Shi Wei,

there is a way, although definitely not the recommended one. Sorry, it took me a little while to prove it. Do you know about external XML entities or XXE attacks?

1. At the top of the index-writers.xml you add an entity declaration:

   <?xml version="1.0" encoding="UTF-8" ?>
   <!DOCTYPE urlset [
     <!ENTITY CREDENTIALS SYSTEM "file:///path/to/credentials.txt">
   ]>

2. it's used later in the index writer spec:

   <writer id="indexer_solr_1" class="org.apache.nutch.indexwriter.solr.SolrIndexWriter">
     <parameters>
       ...
       &CREDENTIALS;
     </parameters>

3. you add your credentials snippet to the file /path/to/credentials.txt

   <param name="username" value="username"/>
   <param name="password" value="SECRET"/>

4. and voila:

   $> bin/nutch index crawldb segment
   ...
   ├────────────┼─────────────────────────────┼─────────┤
   │username    │The username of Solr server. │username │
   ├────────────┼─────────────────────────────┼─────────┤
   │password    │The password of Solr server. │SECRET   │
   └────────────┴─────────────────────────────┴─────────┘

Note: this is a dirty hack but not a security issue: with access to the index-writers.xml you can write anything into it. But there is no guarantee that this hack will continue to work in the future.

Would you please be so kind as to open a Jira issue to add real support for passwords in the index-writers.xml?

Best,
Sebastian

On 11/10/21 11:16, sw.l...@quandatics.com wrote:
> Hi,
>
> We have tried the variable expansion method on the index-writers.xml,
> but it doesn't work. Could you advise if there are any alternative ways to
> encrypt the password in the index-writers.xml file?
>
> Best Regards,
> Shi Wei
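
For readers who want to see the external-entity mechanism from Sebastian's reply in isolation, here is a minimal Java sketch. It is not Nutch's loading code: the class name ExternalEntityDemo, the helper resolvePassword, the DOCTYPE name and the temporary credentials file are all made up for illustration. It assumes the JDK's default JAXP DOM parser, which resolves external general entities unless it has been hardened against XXE; on a locked-down parser the parse call would fail or leave the entity unexpanded.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class ExternalEntityDemo {

    /**
     * Parses a small config document whose DOCTYPE declares an external
     * entity pointing at the given credentials file, then returns the value
     * of the <param name="password"> element pulled in through that entity.
     * Returns null if no such element is found.
     */
    static String resolvePassword(Path credentials) throws Exception {
        // Hypothetical stand-in for index-writers.xml (structure simplified).
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
            + "<!DOCTYPE writers [\n"
            + "  <!ENTITY CREDENTIALS SYSTEM \"" + credentials.toUri() + "\">\n"
            + "]>\n"
            + "<writers><writer id=\"demo\"><parameters>"
            + "&CREDENTIALS;"
            + "</parameters></writer></writers>";

        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Replace entity references with their content in the DOM tree
        // (this is already the default; set here to make the intent explicit).
        factory.setExpandEntityReferences(true);

        Document doc = factory.newDocumentBuilder()
            .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

        // After expansion, the <param> elements from the external file
        // appear as ordinary children of <parameters>.
        NodeList params = doc.getElementsByTagName("param");
        for (int i = 0; i < params.getLength(); i++) {
            Element param = (Element) params.item(i);
            if ("password".equals(param.getAttribute("name"))) {
                return param.getAttribute("value");
            }
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        // Assumption: a throwaway file stands in for /path/to/credentials.txt.
        Path credentials = Files.createTempFile("credentials", ".txt");
        Files.writeString(credentials,
            "<param name=\"username\" value=\"username\"/>\n"
            + "<param name=\"password\" value=\"SECRET\"/>\n");
        try {
            // Prints the password value read through the external entity.
            System.out.println(resolvePassword(credentials));
        } finally {
            Files.delete(credentials);
        }
    }
}
```

The sketch also shows why the password still ends up in the printed parameter table: once the parser has expanded the entity, the application sees the credentials as plain parameter values, indistinguishable from ones written directly in the XML file.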