Re: Replication master+slave

2009-05-15 Thread Michael Ludwig

Bryan Talbot wrote:

So how are people managing solrconfig.xml files which are largely the
same other than differences for replication?

I don't think it's a "good thing" to maintain two copies of the same
file and I'd like to avoid that.  Maybe enabling the XInclude feature
in DocumentBuilders would make it possible to modularize configuration
files to make this possible?


This is already possible using the XML feature called "entities",
more precisely "external general parsed entities" (EGPE). I've never
seen a parser that doesn't do entities.
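
A minimal sketch of the entity approach in Java, the language of the Solr patches in this thread. The class name, the file name replication.ent and the element names are made up for illustration. It also assumes the JDK's default DocumentBuilderFactory still resolves external entities, which hardened setups often switch off.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class EgpeDemo {
    /** Parses a document whose body is pulled in through an external entity. */
    static String includedText() throws Exception {
        // Hypothetical shared fragment, written to a temp directory for the demo.
        Path dir = Files.createTempDirectory("egpe-demo");
        Path fragment = dir.resolve("replication.ent");
        Files.write(fragment,
                "<str name=\"role\">master</str>".getBytes(StandardCharsets.UTF_8));

        // The internal DTD subset declares the entity; &replication; references it.
        String xml = "<!DOCTYPE config [\n"
                + "  <!ENTITY replication SYSTEM \"" + fragment.toUri() + "\">\n"
                + "]>\n"
                + "<config>&replication;</config>";

        // No special switches: entity expansion is on by default (assumption above).
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        Document doc = dbf.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(includedText());
    }
}
```

The point of the sketch is that nothing has to be patched on the parser side; the cost is that every including file needs a DOCTYPE with the entity declarations.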

C:\MILU\dev\XML # type egpe-net.xml
http://lobster.as-guides.com/ds/solr.schema.ent" >

]>

&egpe_from_the_net;
&egpe_from_the_local_disk;


C:\MILU\dev\XML # type egpe-local.ent




Michael Ludwig


Re: Replication master+slave

2009-05-14 Thread Bryan Talbot

https://issues.apache.org/jira/browse/SOLR-1167



-Bryan




On May 13, 2009, at May 13, 7:20 PM, Otis Gospodnetic wrote:



Bryan, maybe it's time to stick this in JIRA?
http://wiki.apache.org/solr/HowToContribute

Thanks,
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 

From: Bryan Talbot 
To: solr-user@lucene.apache.org
Sent: Wednesday, May 13, 2009 10:11:21 PM
Subject: Re: Replication master+slave

I think the patch I included earlier covers Solr core, but it looks like at
least some other extensions (DIH) create and use their own XML parser.  So, if
this functionality is to extend to all XML files, those will need similar
patches.

Here's one for DIH:

--- src/main/java/org/apache/solr/handler/dataimport/DataImporter.java  (revision 774137)
+++ src/main/java/org/apache/solr/handler/dataimport/DataImporter.java  (working copy)
@@ -148,8 +148,10 @@
   void loadDataConfig(String configFile) {

     try {
-      DocumentBuilder builder = DocumentBuilderFactory.newInstance()
-              .newDocumentBuilder();
+      DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
+      dbf.setNamespaceAware(true);
+      dbf.setXIncludeAware(true);
+      DocumentBuilder builder = dbf.newDocumentBuilder();
       Document document = builder.parse(new InputSource(new StringReader(
               configFile)));



The only downside I can see to this is that it doesn't offer very expressive
conditional inclusion: the file is included if it's present; otherwise,
fallback inclusions can be used.  It's also specific to XML files and obviously
won't work for other types of configuration files.  However, it is simple and
effective.


-Bryan




On May 13, 2009, at May 13, 6:36 PM, Otis Gospodnetic wrote:



Coincidentally, from
http://www.cloudera.com/blog/2009/05/07/what%E2%80%99s-new-in-hadoop-core-020/ :

"Hadoop configuration files now support XInclude elements for including
portions of another configuration file (HADOOP-4944). This mechanism allows you
to make configuration files more modular and reusable."


So "others are doing it, too".

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 

From: Bryan Talbot
To: solr-user@lucene.apache.org
Sent: Wednesday, May 13, 2009 11:26:41 AM
Subject: Re: Replication master+slave

I see that Noble's final comment in SOLR-1154 is that config files need to be
able to include snippets from external files.  In my limited testing, a simple
patch to enable XInclude support seems to work.



--- src/java/org/apache/solr/core/Config.java   (revision 774137)
+++ src/java/org/apache/solr/core/Config.java   (working copy)
@@ -100,8 +100,10 @@
       if (lis == null) {
         lis = loader.openConfig(name);
       }
-      javax.xml.parsers.DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
-      doc = builder.parse(lis);
+      javax.xml.parsers.DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
+      dbf.setNamespaceAware(true);
+      dbf.setXIncludeAware(true);
+      doc = dbf.newDocumentBuilder().parse(lis);

       DOMUtil.substituteProperties(doc, loader.getCoreProperties());
     } catch (ParserConfigurationException e)  {



This allows a clause like this to include the contents of replication.xml if
it exists.  If it's not found an exception will be thrown.

<xi:include href="http://localhost:8983/solr/corename/admin/file/?file=replication.xml"
            xmlns:xi="http://www.w3.org/2001/XInclude">
</xi:include>



If the file is optional and no exception should be thrown if the file is
missing, simply include a fallback action: in this case the fallback is empty
and does nothing.

<xi:include href="http://localhost:8983/solr/forum_en/admin/file/?file=replication.xml"
            xmlns:xi="http://www.w3.org/2001/XInclude">
  <xi:fallback/>
</xi:include>




-Bryan




On May 12, 2009, at May 12, 8:05 PM, Jian Han Guo wrote:

I was looking at the same problem, and had a discussion with Noble. You can
use a hack to achieve what you want, see

https://issues.apache.org/jira/browse/SOLR-1154

Thanks,

Jianhan


On Tue, May 12, 2009 at 5:13 PM, Bryan Talbot wrote:

So how are people managing solrconfig.xml files which are largely the same
other than differences for replication?

I don't think it's a "good thing" to maintain two copies of the same file
and I'd like to avoid that.  Maybe enabling the XInclude feature in
DocumentBuilders would make it possible to modularize configuration files to
make this possible?






http://java.sun.com/j2se/1.5.0/docs/api/javax/xml/parsers/DocumentBuilderFactory.html#setXIncludeAware(boolean)



-Bryan





On May 12, 2009, at May 12, 11:43 AM, Shalin Shekhar Mangar wrote:

On Tue, May 12, 2009 at 10:42 PM, Bryan Talbot wrote:


For replication in 1.4, the wiki at
http://wiki.apache.org/solr/

Re: Replication master+slave

2009-05-13 Thread Shalin Shekhar Mangar
There's a related issue open.

https://issues.apache.org/jira/browse/SOLR-712

On Thu, May 14, 2009 at 7:50 AM, Otis Gospodnetic <
otis_gospodne...@yahoo.com> wrote:

>
> Bryan, maybe it's time to stick this in JIRA?
> http://wiki.apache.org/solr/HowToContribute
>
> Thanks,
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
>
>
> - Original Message 
> > From: Bryan Talbot 
> > To: solr-user@lucene.apache.org
> > Sent: Wednesday, May 13, 2009 10:11:21 PM
> > Subject: Re: Replication master+slave
> >
> > I think the patch I included earlier covers solr core, but it looks like
> at
> > least some other extensions (DIH) create and use their own XML parser.
>  So, if
> > this functionality is to extend to all XML files, those will need similar
> > patches.
> >
> > Here's one for DIH:
> >
> > --- src/main/java/org/apache/solr/handler/dataimport/DataImporter.java
> > (revision 774137)
> > +++ src/main/java/org/apache/solr/handler/dataimport/DataImporter.java
>  (working
> > copy)
> > @@ -148,8 +148,10 @@
> >void loadDataConfig(String configFile) {
> >
> >  try {
> > -  DocumentBuilder builder = DocumentBuilderFactory.newInstance()
> > -  .newDocumentBuilder();
> > +  DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
> > +  dbf.setNamespaceAware(true);
> > +  dbf.setXIncludeAware(true);
> > +  DocumentBuilder builder = dbf.newDocumentBuilder();
> >Document document = builder.parse(new InputSource(new
> StringReader(
> >configFile)));
> >
> >
> >
> > The only down side I can see to this is it doesn't offer very expressive
> > conditional inclusion: the file is included if it's present otherwise
> fallback
> > inclusions can be used.  It's also specific to XML files and obviously
> won't
> > work for other types of configuration files.  However, it is simple and
> > effective.
> >
> >
> > -Bryan
> >
> >
> >
> >
> > On May 13, 2009, at May 13, 6:36 PM, Otis Gospodnetic wrote:
> >
> > >
> > > Coincidentally, from
> >
> http://www.cloudera.com/blog/2009/05/07/what%E2%80%99s-new-in-hadoop-core-020/:
> > >
> > > "Hadoop configuration files now support XInclude elements for including
> > portions of another configuration file (HADOOP-4944). This mechanism
> allows you
> > to make configuration files more modular and reusable."
> > >
> > > So "others are doing it, too".
> > >
> > > Otis
> > > --
> > > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> > >
> > >
> > >
> > > - Original Message 
> > >> From: Bryan Talbot
> > >> To: solr-user@lucene.apache.org
> > >> Sent: Wednesday, May 13, 2009 11:26:41 AM
> > >> Subject: Re: Replication master+slave
> > >>
> > >> I see that Noble's final comment in SOLR-1154 is that config files need to be
> > >> able to include snippets from external files.  In my limited testing, a simple
> > >> patch to enable XInclude support seems to work.
> > >>
> > >>
> > >>
> > >> --- src/java/org/apache/solr/core/Config.java   (revision 774137)
> > >> +++ src/java/org/apache/solr/core/Config.java   (working copy)
> > >> @@ -100,8 +100,10 @@
> > >>  if (lis == null) {
> > >>lis = loader.openConfig(name);
> > >>  }
> > >> -  javax.xml.parsers.DocumentBuilder builder =
> > >> DocumentBuilderFactory.newInstance().newDocumentBuilder();
> > >> -  doc = builder.parse(lis);
> > >> +  javax.xml.parsers.DocumentBuilderFactory dbf =
> > >> DocumentBuilderFactory.newInstance();
> > >> +  dbf.setNamespaceAware(true);
> > >> +  dbf.setXIncludeAware(true);
> > >> +  doc = dbf.newDocumentBuilder().parse(lis);
> > >>
> > >>DOMUtil.substituteProperties(doc, loader.getCoreProperties());
> > >> } catch (ParserConfigurationException e)  {
> > >>
> > >>
> > >>
> > >> This allows a clause like this to include the contents of
> replication.xml if
> > it
> > >> exists.  If it's not found an exception will be thrown.
> > >>
> > >>
> > >>
> > >> href="
> http

Re: Replication master+slave

2009-05-13 Thread Otis Gospodnetic

Bryan, maybe it's time to stick this in JIRA?
http://wiki.apache.org/solr/HowToContribute

Thanks,
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 
> From: Bryan Talbot 
> To: solr-user@lucene.apache.org
> Sent: Wednesday, May 13, 2009 10:11:21 PM
> Subject: Re: Replication master+slave
> 
> I think the patch I included earlier covers solr core, but it looks like at 
> least some other extensions (DIH) create and use their own XML parser.  So, 
> if 
> this functionality is to extend to all XML files, those will need similar 
> patches.
> 
> Here's one for DIH:
> 
> --- src/main/java/org/apache/solr/handler/dataimport/DataImporter.java  
> (revision 774137)
> +++ src/main/java/org/apache/solr/handler/dataimport/DataImporter.java  
> (working 
> copy)
> @@ -148,8 +148,10 @@
>void loadDataConfig(String configFile) {
> 
>  try {
> -  DocumentBuilder builder = DocumentBuilderFactory.newInstance()
> -  .newDocumentBuilder();
> +  DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
> +  dbf.setNamespaceAware(true);
> +  dbf.setXIncludeAware(true);
> +  DocumentBuilder builder = dbf.newDocumentBuilder();
>Document document = builder.parse(new InputSource(new StringReader(
>configFile)));
> 
> 
> 
> The only down side I can see to this is it doesn't offer very expressive 
> conditional inclusion: the file is included if it's present otherwise 
> fallback 
> inclusions can be used.  It's also specific to XML files and obviously won't 
> work for other types of configuration files.  However, it is simple and 
> effective.
> 
> 
> -Bryan
> 
> 
> 
> 
> On May 13, 2009, at May 13, 6:36 PM, Otis Gospodnetic wrote:
> 
> > 
> > Coincidentally, from 
> http://www.cloudera.com/blog/2009/05/07/what%E2%80%99s-new-in-hadoop-core-020/
>  :
> > 
> > "Hadoop configuration files now support XInclude elements for including 
> portions of another configuration file (HADOOP-4944). This mechanism allows 
> you 
> to make configuration files more modular and reusable."
> > 
> > So "others are doing it, too".
> > 
> > Otis
> > --
> > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> > 
> > 
> > 
> > - Original Message 
> >> From: Bryan Talbot 
> >> To: solr-user@lucene.apache.org
> >> Sent: Wednesday, May 13, 2009 11:26:41 AM
> >> Subject: Re: Replication master+slave
> >> 
> >> I see that Noble's final comment in SOLR-1154 is that config files need to be
> >> able to include snippets from external files.  In my limited testing, a simple
> >> patch to enable XInclude support seems to work.
> >> 
> >> 
> >> 
> >> --- src/java/org/apache/solr/core/Config.java   (revision 774137)
> >> +++ src/java/org/apache/solr/core/Config.java   (working copy)
> >> @@ -100,8 +100,10 @@
> >>  if (lis == null) {
> >>lis = loader.openConfig(name);
> >>  }
> >> -  javax.xml.parsers.DocumentBuilder builder =
> >> DocumentBuilderFactory.newInstance().newDocumentBuilder();
> >> -  doc = builder.parse(lis);
> >> +  javax.xml.parsers.DocumentBuilderFactory dbf =
> >> DocumentBuilderFactory.newInstance();
> >> +  dbf.setNamespaceAware(true);
> >> +  dbf.setXIncludeAware(true);
> >> +  doc = dbf.newDocumentBuilder().parse(lis);
> >> 
> >>DOMUtil.substituteProperties(doc, loader.getCoreProperties());
> >> } catch (ParserConfigurationException e)  {
> >> 
> >> 
> >> 
> >> This allows a clause like this to include the contents of replication.xml 
> >> if 
> it
> >> exists.  If it's not found an exception will be thrown.
> >> 
> >> 
> >> 
> >> <xi:include href="http://localhost:8983/solr/corename/admin/file/?file=replication.xml"
> >>             xmlns:xi="http://www.w3.org/2001/XInclude">
> >> </xi:include>
> >> 
> >> 
> >> 
> >> If the file is optional and no exception should be thrown if the file is
> >> missing, simply include a fallback action: in this case the fallback is 
> >> empty
> >> and does nothing.
> >> 
> >> 
> >> 
> >> <xi:include href="http://localhost:8983/solr/forum_en/admin/file/?file=replication.xml"
> >>             xmlns:xi="http://www.w3.org/2001/XInclude">
> >>   <xi:fallback/>
> >> </xi:include>
> >> 
> >> 
> &

Re: Replication master+slave

2009-05-13 Thread Bryan Talbot
I think the patch I included earlier covers solr core, but it looks  
like at least some other extensions (DIH) create and use their own XML  
parser.  So, if this functionality is to extend to all XML files,  
those will need similar patches.


Here's one for DIH:

--- src/main/java/org/apache/solr/handler/dataimport/DataImporter.java  (revision 774137)
+++ src/main/java/org/apache/solr/handler/dataimport/DataImporter.java  (working copy)
@@ -148,8 +148,10 @@
   void loadDataConfig(String configFile) {

     try {
-      DocumentBuilder builder = DocumentBuilderFactory.newInstance()
-              .newDocumentBuilder();
+      DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
+      dbf.setNamespaceAware(true);
+      dbf.setXIncludeAware(true);
+      DocumentBuilder builder = dbf.newDocumentBuilder();
       Document document = builder.parse(new InputSource(new StringReader(
               configFile)));



The only downside I can see to this is that it doesn't offer very
expressive conditional inclusion: the file is included if it's present;
otherwise, fallback inclusions can be used.  It's also specific to XML
files and obviously won't work for other types of configuration
files.  However, it is simple and effective.
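
One thing that may be worth checking with the DIH patch: the config is parsed from a String, so, as far as I can tell, the parser has no document location against which to resolve relative href values. A rough sketch of one way around that, assuming the caller can pass a base URI through InputSource.setSystemId. The class, method and file names here are hypothetical and are not part of DIH.

```java
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class StringConfigDemo {
    /** Parses config text held in a String, resolving includes against baseUri. */
    static Document parse(String configText, String baseUri) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        dbf.setXIncludeAware(true);
        InputSource source = new InputSource(new StringReader(configText));
        // Assumption: the caller knows where the text notionally lives.
        source.setSystemId(baseUri);
        return dbf.newDocumentBuilder().parse(source);
    }

    /** Demo: include a sibling file by relative href and count what arrived. */
    static int entityCount() throws Exception {
        Path dir = Files.createTempDirectory("dih-demo");
        Files.write(dir.resolve("entities.xml"),
                "<entity name=\"item\"/>".getBytes(StandardCharsets.UTF_8));
        String configText = "<dataConfig xmlns:xi=\"http://www.w3.org/2001/XInclude\">"
                + "<xi:include href=\"entities.xml\"/></dataConfig>";
        // data-config.xml need not exist; only its location is used as the base.
        String baseUri = dir.resolve("data-config.xml").toUri().toString();
        return parse(configText, baseUri).getElementsByTagName("entity").getLength();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("entities included: " + entityCount());
    }
}
```

With absolute href values, as in the http://localhost:8983/... examples in this thread, the base URI should not matter.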



-Bryan




On May 13, 2009, at May 13, 6:36 PM, Otis Gospodnetic wrote:



Coincidentally, from http://www.cloudera.com/blog/2009/05/07/what%E2%80%99s-new-in-hadoop-core-020/ 
 :


"Hadoop configuration files now support XInclude elements for  
including portions of another configuration file (HADOOP-4944). This  
mechanism allows you to make configuration files more modular and  
reusable."


So "others are doing it, too".

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 

From: Bryan Talbot 
To: solr-user@lucene.apache.org
Sent: Wednesday, May 13, 2009 11:26:41 AM
Subject: Re: Replication master+slave

I see that Noble's final comment in SOLR-1154 is that config files need to be
able to include snippets from external files.  In my limited testing, a simple
patch to enable XInclude support seems to work.



--- src/java/org/apache/solr/core/Config.java   (revision 774137)
+++ src/java/org/apache/solr/core/Config.java   (working copy)
@@ -100,8 +100,10 @@
       if (lis == null) {
         lis = loader.openConfig(name);
       }
-      javax.xml.parsers.DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
-      doc = builder.parse(lis);
+      javax.xml.parsers.DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
+      dbf.setNamespaceAware(true);
+      dbf.setXIncludeAware(true);
+      doc = dbf.newDocumentBuilder().parse(lis);

       DOMUtil.substituteProperties(doc, loader.getCoreProperties());
     } catch (ParserConfigurationException e)  {



This allows a clause like this to include the contents of replication.xml if
it exists.  If it's not found an exception will be thrown.

<xi:include href="http://localhost:8983/solr/corename/admin/file/?file=replication.xml"
            xmlns:xi="http://www.w3.org/2001/XInclude">
</xi:include>



If the file is optional and no exception should be thrown if the file is
missing, simply include a fallback action: in this case the fallback is empty
and does nothing.

<xi:include href="http://localhost:8983/solr/forum_en/admin/file/?file=replication.xml"
            xmlns:xi="http://www.w3.org/2001/XInclude">
  <xi:fallback/>
</xi:include>




-Bryan




On May 12, 2009, at May 12, 8:05 PM, Jian Han Guo wrote:

I was looking at the same problem, and had a discussion with Noble. You can
use a hack to achieve what you want, see

https://issues.apache.org/jira/browse/SOLR-1154

Thanks,

Jianhan


On Tue, May 12, 2009 at 5:13 PM, Bryan Talbot wrote:

So how are people managing solrconfig.xml files which are largely the same
other than differences for replication?

I don't think it's a "good thing" to maintain two copies of the same file
and I'd like to avoid that.  Maybe enabling the XInclude feature in
DocumentBuilders would make it possible to modularize configuration files to
make this possible?




http://java.sun.com/j2se/1.5.0/docs/api/javax/xml/parsers/DocumentBuilderFactory.html#setXIncludeAware(boolean)



-Bryan





On May 12, 2009, at May 12, 11:43 AM, Shalin Shekhar Mangar wrote:

On Tue, May 12, 2009 at 10:42 PM, Bryan Talbot wrote:

For replication in 1.4, the wiki at
http://wiki.apache.org/solr/SolrReplication says that a node can be both
the master and a slave:

A node can act as both master and slave. In that case both the master and
slave configuration lists need to be present inside the ReplicationHandler
requestHandler in the solrconfig.xml.

What does this mean?  Does the core then poll itself for updates?




No. This type of configuration is meant for "repeaters". Suppose there are
slaves in multiple data-centers (say da

Re: Replication master+slave

2009-05-13 Thread Otis Gospodnetic

Coincidentally, from 
http://www.cloudera.com/blog/2009/05/07/what%E2%80%99s-new-in-hadoop-core-020/ :

"Hadoop configuration files now support XInclude elements for including 
portions of another configuration file (HADOOP-4944). This mechanism allows you 
to make configuration files more modular and reusable."

So "others are doing it, too".

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 
> From: Bryan Talbot 
> To: solr-user@lucene.apache.org
> Sent: Wednesday, May 13, 2009 11:26:41 AM
> Subject: Re: Replication master+slave
> 
> I see that Noble's final comment in SOLR-1154 is that config files need to be
> able to include snippets from external files.  In my limited testing, a simple
> patch to enable XInclude support seems to work.
> 
> 
> 
> --- src/java/org/apache/solr/core/Config.java   (revision 774137)
> +++ src/java/org/apache/solr/core/Config.java   (working copy)
> @@ -100,8 +100,10 @@
>   if (lis == null) {
> lis = loader.openConfig(name);
>   }
> -  javax.xml.parsers.DocumentBuilder builder = 
> DocumentBuilderFactory.newInstance().newDocumentBuilder();
> -  doc = builder.parse(lis);
> +  javax.xml.parsers.DocumentBuilderFactory dbf = 
> DocumentBuilderFactory.newInstance();
> +  dbf.setNamespaceAware(true);
> +  dbf.setXIncludeAware(true);
> +  doc = dbf.newDocumentBuilder().parse(lis);
> 
> DOMUtil.substituteProperties(doc, loader.getCoreProperties());
> } catch (ParserConfigurationException e)  {
> 
> 
> 
> This allows a clause like this to include the contents of replication.xml if 
> it 
> exists.  If it's not found an exception will be thrown.
> 
> 
> 
> <xi:include href="http://localhost:8983/solr/corename/admin/file/?file=replication.xml"
>             xmlns:xi="http://www.w3.org/2001/XInclude">
> </xi:include>
> 
> 
> 
> If the file is optional and no exception should be thrown if the file is 
> missing, simply include a fallback action: in this case the fallback is empty 
> and does nothing.
> 
> 
> 
> <xi:include href="http://localhost:8983/solr/forum_en/admin/file/?file=replication.xml"
>             xmlns:xi="http://www.w3.org/2001/XInclude">
>   <xi:fallback/>
> </xi:include>
> 
> 
> 
> 
> -Bryan
> 
> 
> 
> 
> On May 12, 2009, at May 12, 8:05 PM, Jian Han Guo wrote:
> 
> > I was looking at the same problem, and had a discussion with Noble. You can
> > use a hack to achieve what you want, see
> > 
> > https://issues.apache.org/jira/browse/SOLR-1154
> > 
> > Thanks,
> > 
> > Jianhan
> > 
> > 
> > On Tue, May 12, 2009 at 5:13 PM, Bryan Talbot wrote:
> > 
> >> So how are people managing solrconfig.xml files which are largely the same
> >> other than differences for replication?
> >> 
> >> I don't think it's a "good thing" to maintain two copies of the same file
> >> and I'd like to avoid that.  Maybe enabling the XInclude feature in
> >> DocumentBuilders would make it possible to modularize configuration files 
> >> to
> >> make this possible?
> >> 
> >> 
> >> 
> http://java.sun.com/j2se/1.5.0/docs/api/javax/xml/parsers/DocumentBuilderFactory.html#setXIncludeAware(boolean)
> >> 
> >> 
> >> -Bryan
> >> 
> >> 
> >> 
> >> 
> >> 
> >> On May 12, 2009, at May 12, 11:43 AM, Shalin Shekhar Mangar wrote:
> >> 
> >> On Tue, May 12, 2009 at 10:42 PM, Bryan Talbot 
> >>>> wrote:
> >>> 
> >>> For replication in 1.4, the wiki at
> >>>> http://wiki.apache.org/solr/SolrReplication says that a node can be both
> >>>> the master and a slave:
> >>>> 
> >>>> A node can act as both master and slave. In that case both the master and
> >>>> slave configuration lists need to be present inside the
> >>>> ReplicationHandler
> >>>> requestHandler in the solrconfig.xml.
> >>>> 
> >>>> What does this mean?  Does the core then poll itself for updates?
> >>>> 
> >>> 
> >>> 
> >>> No. This type of configuration is meant for "repeaters". Suppose there are
> >>> slaves in multiple data-centers (say data center A and B). There is always
> >>> a
> >>> single master (say in A). One of the slaves in B is used as a master for
> >>> the
> >>> other slaves in B. Therefore, this one slave in B is both a master as well
> >>> as the slave.
> >>&g

Re: Replication master+slave

2009-05-13 Thread Peter Wolanin
Indeed - that looks nice - having some kind of conditional includes
would make many things easier.

-Peter

On Wed, May 13, 2009 at 4:22 PM, Otis Gospodnetic
 wrote:
>
> This looks nice and simple.  I don't know enough about this stuff to see any 
> issues.  If there are no issues.?
>
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
>
>
> - Original Message 
>> From: Bryan Talbot 
>> To: solr-user@lucene.apache.org
>> Sent: Wednesday, May 13, 2009 11:26:41 AM
>> Subject: Re: Replication master+slave
>>
>> I see that Noble's final comment in SOLR-1154 is that config files need to be
>> able to include snippets from external files.  In my limited testing, a simple
>> patch to enable XInclude support seems to work.
>>
>>
>>
>> --- src/java/org/apache/solr/core/Config.java   (revision 774137)
>> +++ src/java/org/apache/solr/core/Config.java   (working copy)
>> @@ -100,8 +100,10 @@
>>   if (lis == null) {
>>     lis = loader.openConfig(name);
>>   }
>> -      javax.xml.parsers.DocumentBuilder builder =
>> DocumentBuilderFactory.newInstance().newDocumentBuilder();
>> -      doc = builder.parse(lis);
>> +      javax.xml.parsers.DocumentBuilderFactory dbf =
>> DocumentBuilderFactory.newInstance();
>> +      dbf.setNamespaceAware(true);
>> +      dbf.setXIncludeAware(true);
>> +      doc = dbf.newDocumentBuilder().parse(lis);
>>
>>     DOMUtil.substituteProperties(doc, loader.getCoreProperties());
>> } catch (ParserConfigurationException e)  {
>>
>>
>>
>> This allows a clause like this to include the contents of replication.xml if 
>> it
>> exists.  If it's not found an exception will be thrown.
>>
>>
>>
>> <xi:include href="http://localhost:8983/solr/corename/admin/file/?file=replication.xml"
>>             xmlns:xi="http://www.w3.org/2001/XInclude">
>> </xi:include>
>>
>>
>>
>> If the file is optional and no exception should be thrown if the file is
>> missing, simply include a fallback action: in this case the fallback is empty
>> and does nothing.
>>
>>
>>
>> <xi:include href="http://localhost:8983/solr/forum_en/admin/file/?file=replication.xml"
>>             xmlns:xi="http://www.w3.org/2001/XInclude">
>>   <xi:fallback/>
>> </xi:include>
>>
>>
>>
>>
>> -Bryan
>>
>>
>>
>>
>> On May 12, 2009, at May 12, 8:05 PM, Jian Han Guo wrote:
>>
>> > I was looking at the same problem, and had a discussion with Noble. You can
>> > use a hack to achieve what you want, see
>> >
>> > https://issues.apache.org/jira/browse/SOLR-1154
>> >
>> > Thanks,
>> >
>> > Jianhan
>> >
>> >
>> > On Tue, May 12, 2009 at 5:13 PM, Bryan Talbot wrote:
>> >
>> >> So how are people managing solrconfig.xml files which are largely the same
>> >> other than differences for replication?
>> >>
>> >> I don't think it's a "good thing" to maintain two copies of the same file
>> >> and I'd like to avoid that.  Maybe enabling the XInclude feature in
>> >> DocumentBuilders would make it possible to modularize configuration files 
>> >> to
>> >> make this possible?
>> >>
>> >>
>> >>
>> http://java.sun.com/j2se/1.5.0/docs/api/javax/xml/parsers/DocumentBuilderFactory.html#setXIncludeAware(boolean)
>> >>
>> >>
>> >> -Bryan
>> >>
>> >>
>> >>
>> >>
>> >>
>> >> On May 12, 2009, at May 12, 11:43 AM, Shalin Shekhar Mangar wrote:
>> >>
>> >> On Tue, May 12, 2009 at 10:42 PM, Bryan Talbot
>> >>>> wrote:
>> >>>
>> >>> For replication in 1.4, the wiki at
>> >>>> http://wiki.apache.org/solr/SolrReplication says that a node can be both
>> >>>> the master and a slave:
>> >>>>
>> >>>> A node can act as both master and slave. In that case both the master 
>> >>>> and
>> >>>> slave configuration lists need to be present inside the
>> >>>> ReplicationHandler
>> >>>> requestHandler in the solrconfig.xml.
>> >>>>
>> >>>> What does this mean?  Does the core then poll itself for updates?
>> >>>>
>> >>>
>> >>>
>> >>> No. This type of configuration is meant for &

Re: Replication master+slave

2009-05-13 Thread Otis Gospodnetic

This looks nice and simple.  I don't know enough about this stuff to see any 
issues.  If there are no issues.?

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 
> From: Bryan Talbot 
> To: solr-user@lucene.apache.org
> Sent: Wednesday, May 13, 2009 11:26:41 AM
> Subject: Re: Replication master+slave
> 
> I see that Noble's final comment in SOLR-1154 is that config files need to be
> able to include snippets from external files.  In my limited testing, a simple
> patch to enable XInclude support seems to work.
> 
> 
> 
> --- src/java/org/apache/solr/core/Config.java   (revision 774137)
> +++ src/java/org/apache/solr/core/Config.java   (working copy)
> @@ -100,8 +100,10 @@
>   if (lis == null) {
> lis = loader.openConfig(name);
>   }
> -  javax.xml.parsers.DocumentBuilder builder = 
> DocumentBuilderFactory.newInstance().newDocumentBuilder();
> -  doc = builder.parse(lis);
> +  javax.xml.parsers.DocumentBuilderFactory dbf = 
> DocumentBuilderFactory.newInstance();
> +  dbf.setNamespaceAware(true);
> +  dbf.setXIncludeAware(true);
> +  doc = dbf.newDocumentBuilder().parse(lis);
> 
> DOMUtil.substituteProperties(doc, loader.getCoreProperties());
> } catch (ParserConfigurationException e)  {
> 
> 
> 
> This allows a clause like this to include the contents of replication.xml if 
> it 
> exists.  If it's not found an exception will be thrown.
> 
> 
> 
> <xi:include href="http://localhost:8983/solr/corename/admin/file/?file=replication.xml"
>             xmlns:xi="http://www.w3.org/2001/XInclude">
> </xi:include>
> 
> 
> 
> If the file is optional and no exception should be thrown if the file is 
> missing, simply include a fallback action: in this case the fallback is empty 
> and does nothing.
> 
> 
> 
> <xi:include href="http://localhost:8983/solr/forum_en/admin/file/?file=replication.xml"
>             xmlns:xi="http://www.w3.org/2001/XInclude">
>   <xi:fallback/>
> </xi:include>
> 
> 
> 
> 
> -Bryan
> 
> 
> 
> 
> On May 12, 2009, at May 12, 8:05 PM, Jian Han Guo wrote:
> 
> > I was looking at the same problem, and had a discussion with Noble. You can
> > use a hack to achieve what you want, see
> > 
> > https://issues.apache.org/jira/browse/SOLR-1154
> > 
> > Thanks,
> > 
> > Jianhan
> > 
> > 
> > On Tue, May 12, 2009 at 5:13 PM, Bryan Talbot wrote:
> > 
> >> So how are people managing solrconfig.xml files which are largely the same
> >> other than differences for replication?
> >> 
> >> I don't think it's a "good thing" to maintain two copies of the same file
> >> and I'd like to avoid that.  Maybe enabling the XInclude feature in
> >> DocumentBuilders would make it possible to modularize configuration files 
> >> to
> >> make this possible?
> >> 
> >> 
> >> 
> http://java.sun.com/j2se/1.5.0/docs/api/javax/xml/parsers/DocumentBuilderFactory.html#setXIncludeAware(boolean)
> >> 
> >> 
> >> -Bryan
> >> 
> >> 
> >> 
> >> 
> >> 
> >> On May 12, 2009, at May 12, 11:43 AM, Shalin Shekhar Mangar wrote:
> >> 
> >> On Tue, May 12, 2009 at 10:42 PM, Bryan Talbot 
> >>>> wrote:
> >>> 
> >>> For replication in 1.4, the wiki at
> >>>> http://wiki.apache.org/solr/SolrReplication says that a node can be both
> >>>> the master and a slave:
> >>>> 
> >>>> A node can act as both master and slave. In that case both the master and
> >>>> slave configuration lists need to be present inside the
> >>>> ReplicationHandler
> >>>> requestHandler in the solrconfig.xml.
> >>>> 
> >>>> What does this mean?  Does the core then poll itself for updates?
> >>>> 
> >>> 
> >>> 
> >>> No. This type of configuration is meant for "repeaters". Suppose there are
> >>> slaves in multiple data-centers (say data center A and B). There is always
> >>> a
> >>> single master (say in A). One of the slaves in B is used as a master for
> >>> the
> >>> other slaves in B. Therefore, this one slave in B is both a master as well
> >>> as the slave.
> >>> 
> >>> 
> >>> 
> >>>> I'd like to have a single set of configuration files that are shared by
> >>>> masters and slaves and avoid duplicating configuration details in
> >>>> multiple
> >>>> files (one for master and one for slave) to ease management and failover.
> >>>> Is this possible?
> >>>> 
> >>>> 
> >>> You wouldn't want the master to be a slave. So I guess you'd need to have
> >>> a
> >>> separate file. Also, it needs to be a separate file so that the slave does
> >>> not become a master when the solrconfig.xml is replicated.
> >>> 
> >>> 
> >>> 
> >>>> When I attempt to set up a multi-server master-slave configuration and
> >>>> include both master and slave replication configuration options, I run into
> >>>> some problems.  I'm running a nightly build from May 7.
> >>>> 
> >>>> 
> >>> Not sure what happened. Is that the url for this solr (meaning same solr
> >>> url
> >>> is master and slave of itself)? If yes, that is not a valid configuration.
> >>> 
> >>> --
> >>> Regards,
> >>> Shalin Shekhar Mangar.
> >>> 
> >> 
> >> 



Re: Replication master+slave

2009-05-13 Thread Bryan Talbot
I see that Noble's final comment in SOLR-1154 is that config files
need to be able to include snippets from external files.  In my
limited testing, a simple patch to enable XInclude support seems to
work.




--- src/java/org/apache/solr/core/Config.java   (revision 774137)
+++ src/java/org/apache/solr/core/Config.java   (working copy)
@@ -100,8 +100,10 @@
       if (lis == null) {
         lis = loader.openConfig(name);
       }
-      javax.xml.parsers.DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
-      doc = builder.parse(lis);
+      javax.xml.parsers.DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
+      dbf.setNamespaceAware(true);
+      dbf.setXIncludeAware(true);
+      doc = dbf.newDocumentBuilder().parse(lis);

       DOMUtil.substituteProperties(doc, loader.getCoreProperties());
     } catch (ParserConfigurationException e)  {



With this patch, a clause like the following includes the contents of  
replication.xml if it exists.  If the file is not found, an exception  
is thrown.



<xi:include href="http://localhost:8983/solr/corename/admin/file/?file=replication.xml"
            xmlns:xi="http://www.w3.org/2001/XInclude">
</xi:include>



If the file is optional and no exception should be thrown when it is  
missing, include a fallback action.  In this case the fallback is  
empty and does nothing.



<xi:include href="http://localhost:8983/solr/forum_en/admin/file/?file=replication.xml"
            xmlns:xi="http://www.w3.org/2001/XInclude">
  <xi:fallback/>
</xi:include>
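
For illustration, here is a minimal standalone sketch of the same parser setup outside Solr. It is not Solr code: the class name XIncludeSketch, the file names main.xml and replication.xml, and the tiny config document are all assumptions made up for the example. It parses a file with an XInclude-aware DocumentBuilderFactory and counts the elements that end up under the root, once with the included file present and once with it missing, so the empty fallback is exercised.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class XIncludeSketch {

    // Hypothetical stand-in for solrconfig.xml: pulls in replication.xml
    // from the same directory, with an empty fallback if it is missing.
    static final String MAIN_XML =
        "<config xmlns:xi=\"http://www.w3.org/2001/XInclude\">"
      + "<xi:include href=\"replication.xml\"><xi:fallback/></xi:include>"
      + "</config>";

    // Hypothetical fragment standing in for the replication settings.
    static final String FRAGMENT_XML =
        "<requestHandler name=\"/replication\" class=\"solr.ReplicationHandler\"/>";

    // Writes the files to a fresh temp directory, parses main.xml with
    // XInclude enabled, and returns the number of child elements of <config>.
    static int countIncludedElements(boolean withFragment) {
        try {
            Path dir = Files.createTempDirectory("xinclude-sketch");
            Path main = dir.resolve("main.xml");
            Files.write(main, MAIN_XML.getBytes(StandardCharsets.UTF_8));
            if (withFragment) {
                Files.write(dir.resolve("replication.xml"),
                            FRAGMENT_XML.getBytes(StandardCharsets.UTF_8));
            }

            // Same two switches as in the patch: XInclude needs namespaces.
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            dbf.setXIncludeAware(true);
            DocumentBuilder builder = dbf.newDocumentBuilder();

            // Parsing from a File gives the parser a base URI, so the
            // relative href can be resolved against main.xml's directory.
            Document doc = builder.parse(main.toFile());

            NodeList children = doc.getDocumentElement().getChildNodes();
            int count = 0;
            for (int i = 0; i < children.getLength(); i++) {
                if (children.item(i).getNodeType() == Node.ELEMENT_NODE) {
                    count++;
                }
            }
            return count;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("with replication.xml:    " + countIncludedElements(true));
        System.out.println("without replication.xml: " + countIncludedElements(false));
    }
}
```

One thing the sketch makes visible: a relative href only resolves if the parser knows where the including document lives. When parsing from a bare InputStream, as in the patch, there is no base URI, which may be why the examples here use absolute URLs.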




-Bryan




On May 12, 2009, at May 12, 8:05 PM, Jian Han Guo wrote:

I was looking at the same problem, and had a discussion with Noble.  
You can

use a hack to achieve what you want, see

https://issues.apache.org/jira/browse/SOLR-1154

Thanks,

Jianhan


On Tue, May 12, 2009 at 5:13 PM, Bryan Talbot  
wrote:


So how are people managing solrconfig.xml files which are largely  
the same

other than differences for replication?

I don't think it's a "good thing" to maintain two copies of the  
same file

and I'd like to avoid that.  Maybe enabling the XInclude feature in
DocumentBuilders would make it possible to modularize configuration  
files to

make this possible?


http://java.sun.com/j2se/1.5.0/docs/api/javax/xml/parsers/DocumentBuilderFactory.html#setXIncludeAware(boolean) 




-Bryan





On May 12, 2009, at May 12, 11:43 AM, Shalin Shekhar Mangar wrote:

On Tue, May 12, 2009 at 10:42 PM, Bryan Talbot  

wrote:


For replication in 1.4, the wiki at
http://wiki.apache.org/solr/SolrReplication says that a node can  
be both

the master and a slave:

A node can act as both master and slave. In that case both the  
master and

slave configuration lists need to be present inside the
ReplicationHandler
requestHandler in the solrconfig.xml.

What does this mean?  Does the core then poll itself for updates?




No. This type of configuration is meant for "repeaters". Suppose  
there are
slaves in multiple data-centers (say data center A and B). There  
is always

a
single master (say in A). One of the slaves in B is used as a  
master for

the
other slaves in B. Therefore, this one slave in B is both a master  
as well

as the slave.



I'd like to have a single set of configuration files that are  
shared by

masters and slaves and avoid duplicating configuration details in
multiple
files (one for master and one for slave) to ease management and  
failover.

Is this possible?


You wouldn't want the master to be a slave. So I guess you'd need  
to have

a
separate file. Also, it needs to be a separate file so that the  
slave does

not become a master when the solrconfig.xml is replicated.



When I attempt to set up a multi-server master-slave configuration  
and
include both master and slave replication configuration options,  
I run into

some
problems.  I'm  running a nightly build from May 7.


Not sure what happened. Is that the url for this solr (meaning  
same solr

url
is master and slave of itself)? If yes, that is not a valid  
configuration.


--
Regards,
Shalin Shekhar Mangar.








Re: Replication master+slave

2009-05-12 Thread Jian Han Guo
I was looking at the same problem, and had a discussion with Noble. You can
use a hack to achieve what you want, see

https://issues.apache.org/jira/browse/SOLR-1154

Thanks,

Jianhan


On Tue, May 12, 2009 at 5:13 PM, Bryan Talbot wrote:

> So how are people managing solrconfig.xml files which are largely the same
> other than differences for replication?
>
> I don't think it's a "good thing" to maintain two copies of the same file
> and I'd like to avoid that.  Maybe enabling the XInclude feature in
> DocumentBuilders would make it possible to modularize configuration files to
> make this possible?
>
>
> http://java.sun.com/j2se/1.5.0/docs/api/javax/xml/parsers/DocumentBuilderFactory.html#setXIncludeAware(boolean)
>
>
> -Bryan
>
>
>
>
>
> On May 12, 2009, at May 12, 11:43 AM, Shalin Shekhar Mangar wrote:
>
>  On Tue, May 12, 2009 at 10:42 PM, Bryan Talbot wrote:
>>
>>  For replication in 1.4, the wiki at
>>> http://wiki.apache.org/solr/SolrReplication says that a node can be both
>>> the master and a slave:
>>>
>>> A node can act as both master and slave. In that case both the master and
>>> slave configuration lists need to be present inside the
>>> ReplicationHandler
>>> requestHandler in the solrconfig.xml.
>>>
>>> What does this mean?  Does the core then poll itself for updates?
>>>
>>
>>
>> No. This type of configuration is meant for "repeaters". Suppose there are
>> slaves in multiple data-centers (say data center A and B). There is always
>> a
>> single master (say in A). One of the slaves in B is used as a master for
>> the
>> other slaves in B. Therefore, this one slave in B is both a master as well
>> as the slave.
>>
>>
>>
>>> I'd like to have a single set of configuration files that are shared by
>>> masters and slaves and avoid duplicating configuration details in
>>> multiple
>>> files (one for master and one for slave) to ease management and failover.
>>> Is this possible?
>>>
>>>
>> You wouldn't want the master to be a slave. So I guess you'd need to have
>> a
>> separate file. Also, it needs to be a separate file so that the slave does
>> not become a master when the solrconfig.xml is replicated.
>>
>>
>>
>>> When I attempt to set up a multi-server master-slave configuration and
>>> include both master and slave replication configuration options, I run into
>>> some
>>> problems.  I'm  running a nightly build from May 7.
>>>
>>>
>> Not sure what happened. Is that the url for this solr (meaning same solr
>> url
>> is master and slave of itself)? If yes, that is not a valid configuration.
>>
>> --
>> Regards,
>> Shalin Shekhar Mangar.
>>
>
>


Re: Replication master+slave

2009-05-12 Thread Bryan Talbot
So how are people managing solrconfig.xml files which are largely the  
same other than differences for replication?


I don't think it's a "good thing" to maintain two copies of the same  
file and I'd like to avoid that.  Maybe enabling the XInclude feature  
in DocumentBuilders would make it possible to modularize configuration  
files to make this possible?


http://java.sun.com/j2se/1.5.0/docs/api/javax/xml/parsers/DocumentBuilderFactory.html#setXIncludeAware(boolean)


-Bryan




On May 12, 2009, at May 12, 11:43 AM, Shalin Shekhar Mangar wrote:

On Tue, May 12, 2009 at 10:42 PM, Bryan Talbot  
wrote:



For replication in 1.4, the wiki at
http://wiki.apache.org/solr/SolrReplication says that a node can be  
both

the master and a slave:

A node can act as both master and slave. In that case both the  
master and
slave configuration lists need to be present inside the  
ReplicationHandler

requestHandler in the solrconfig.xml.

What does this mean?  Does the core then poll itself for updates?



No. This type of configuration is meant for "repeaters". Suppose  
there are
slaves in multiple data-centers (say data center A and B). There is  
always a
single master (say in A). One of the slaves in B is used as a master  
for the
other slaves in B. Therefore, this one slave in B is both a master  
as well

as the slave.




I'd like to have a single set of configuration files that are  
shared by
masters and slaves and avoid duplicating configuration details in  
multiple
files (one for master and one for slave) to ease management and  
failover.

Is this possible?



You wouldn't want the master to be a slave. So I guess you'd need to  
have a
separate file. Also, it needs to be a separate file so that the  
slave does

not become a master when the solrconfig.xml is replicated.




When I attempt to set up a multi-server master-slave configuration and
include both master and slave replication configuration options, I  
run into some

problems.  I'm  running a nightly build from May 7.



Not sure what happened. Is that the url for this solr (meaning same  
solr url
is master and slave of itself)? If yes, that is not a valid  
configuration.


--
Regards,
Shalin Shekhar Mangar.




Re: Replication master+slave

2009-05-12 Thread Shalin Shekhar Mangar
On Tue, May 12, 2009 at 10:42 PM, Bryan Talbot wrote:

> For replication in 1.4, the wiki at
> http://wiki.apache.org/solr/SolrReplication says that a node can be both
> the master and a slave:
>
> A node can act as both master and slave. In that case both the master and
> slave configuration lists need to be present inside the ReplicationHandler
> requestHandler in the solrconfig.xml.
>
> What does this mean?  Does the core then poll itself for updates?


No. This type of configuration is meant for "repeaters". Suppose there are
slaves in multiple data-centers (say data center A and B). There is always a
single master (say in A). One of the slaves in B is used as a master for the
other slaves in B. Therefore, this one slave in B is both a master as well
as the slave.


>
> I'd like to have a single set of configuration files that are shared by
> masters and slaves and avoid duplicating configuration details in multiple
> files (one for master and one for slave) to ease management and failover.
>  Is this possible?
>

You wouldn't want the master to be a slave. So I guess you'd need to have a
separate file. Also, it needs to be a separate file so that the slave does
not become a master when the solrconfig.xml is replicated.


>
> When I attempt to set up a multi-server master-slave configuration and
> include both master and slave replication configuration options, I run into some
> problems.  I'm  running a nightly build from May 7.
>

Not sure what happened. Is that the url for this solr (meaning same solr url
is master and slave of itself)? If yes, that is not a valid configuration.

-- 
Regards,
Shalin Shekhar Mangar.


Replication master+slave

2009-05-12 Thread Bryan Talbot
For replication in 1.4, the wiki at http://wiki.apache.org/solr/SolrReplication 
 says that a node can be both the master and a slave:


A node can act as both master and slave. In that case both the master  
and slave configuration lists need to be present inside the  
ReplicationHandler requestHandler in the solrconfig.xml.


What does this mean?  Does the core then poll itself for updates?

I'd like to have a single set of configuration files that are shared  
by masters and slaves and avoid duplicating configuration details in  
multiple files (one for master and one for slave) to ease management  
and failover.  Is this possible?


When I attempt to set up a multi-server master-slave configuration and  
include both master and slave replication configuration options, I  
run into some problems.  I'm running a nightly build from May 7.



  

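<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <str name="replicateAfter">commit</str>
  </lst>
  <lst name="slave">
    <str name="masterUrl">http://master_core01:8983/solr/core01/replication</str>
    <str name="pollInterval">00:00:60</str>
  </lst>
</requestHandler>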

  


When the replication admin page
(http://master_core01:8983/solr/core01/admin/replication/index.jsp)
is visited, the severe error shown below appears in the Solr log.  The  
server is otherwise idle, so there is no reason all threads should be  
busy unless the replication code is getting itself into a loop.


What's the right way to do this?



May 11, 2009 8:01:22 PM org.apache.tomcat.util.threads.ThreadPool logFull
SEVERE: All threads (150) are currently busy, waiting. Increase maxThreads (150) or check the servlet status
May 11, 2009 8:01:41 PM org.apache.solr.handler.ReplicationHandler getReplicationDetails
WARNING: Exception while invoking a 'details' method on master
java.net.SocketTimeoutException: Read timed out
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.read(SocketInputStream.java:129)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
    at org.apache.commons.httpclient.HttpParser.readRawLine(HttpParser.java:78)
    at org.apache.commons.httpclient.HttpParser.readLine(HttpParser.java:106)
    at org.apache.commons.httpclient.HttpConnection.readLine(HttpConnection.java:1116)
    at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$HttpConnectionAdapter.readLine(MultiThreadedHttpConnectionManager.java:1413)
    at org.apache.commons.httpclient.HttpMethodBase.readStatusLine(HttpMethodBase.java:1973)
    at org.apache.commons.httpclient.HttpMethodBase.readResponse(HttpMethodBase.java:1735)
    at org.apache.commons.httpclient.HttpMethodBase.execute(HttpMethodBase.java:1098)
    at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:398)
    at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
    at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
    at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
    at org.apache.solr.handler.SnapPuller.getNamedListResponse(SnapPuller.java:183)
    at org.apache.solr.handler.SnapPuller.getCommandResponse(SnapPuller.java:178)
    at org.apache.solr.handler.ReplicationHandler.getReplicationDetails(ReplicationHandler.java:555)
    at org.apache.solr.handler.ReplicationHandler.handleRequestBody(ReplicationHandler.java:147)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1330)
    at org.apache.jsp.admin.replication.index_jsp.executeCommand(index_jsp.java:34)
    at org.apache.jsp.admin.replication.index_jsp._jspService(index_jsp.java:208)
    at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:98)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:729)
    at org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:331)
    at org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:329)
    at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:265)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:729)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:269)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
    at org.apache.catalina.core.ApplicationDispatcher.invoke(ApplicationDispatcher.java:679)
    at org.apache.catalina.core.ApplicationDispatcher.processRequest(ApplicationDispatcher.java:461)
    at org.apache.catalina.core.ApplicationDispatcher.doForward(ApplicationDispatcher.java:399)
at  
org 
.apache 
.catalina 
.core.Appl