Re: SolrCloud is sick.

Martin Gainty Sun, 03 Nov 2019 16:01:04 -0800

Here is a bug I cannot shake when building lucene/site

inside lucene/src/main/xml/ENTITY_TermQuery.xml


<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE TermQuery [
<!ENTITY internalTerm "sumitomo">
<!ENTITY externalTerm SYSTEM "http://www.bar.xyz/external">
<!ENTITY % myParameterEntity "http://www.bar.xyz/param">
....

using ant build.xml:
 <!--
      The XSL input file is ignored completely, but XSL expects one to be given,
      so we pass ourself (${ant.file}) here. The list of module build.xmls is given
      via string parameter, that must be splitted by the XSL at '|'.
    -->
    <xslt in="${ant.file}" out="${javadoc.dir}/index.html" style="site/xsl/index.xsl" force="true">
      <outputproperty name="method" value="html"/>
      <outputproperty name="version" value="4.0"/>
      <outputproperty name="encoding" value="UTF-8"/>
      <outputproperty name="indent" value="yes"/>
      <param name="buildfiles" expression="${process-webpages.buildfiles}"/>
      <param name="version" expression="${version}"/>
      <param name="defaultCodec" expression="${defaultCodec}"/>
    </xslt>

OR maven pom.xml
  <plugin>
    <groupId>org.codehaus.mojo</groupId>
    <artifactId>xml-maven-plugin</artifactId>
    <version>1.0.1</version>
    <executions>
      <execution>
        <id>validate</id>
        <phase>initialize</phase>
        <goals>
          <goal>transform</goal>
        </goals>
        <configuration>
          <forceCreation>true</forceCreation>
          <skip>false</skip>
          <outputDirectory>${project.build.directory}/target</outputDirectory>
          <transformationSets>
            <transformationSet>
              <dir>src/main/xml</dir>
              <stylesheet>C:/Maven-plugin/lucene-solr/lucene/site/xsl/index.xsl</stylesheet>
              <parameters>
                <parameter>
                  <name>MyParam</name>
                  <value>true</value>
                </parameter>
              </parameters>
            </transformationSet>
          </transformationSets>
        </configuration>
      </execution>
    </executions>
    <dependencies>
      <dependency>
        <groupId>net.sf.saxon</groupId>
        <artifactId>Saxon-HE</artifactId>
        <version>9.9.1-1</version>
      </dependency>
    </dependencies>
  </plugin>

With either build executing the XSLT I get the same error:

[ERROR] Failed to execute goal 
org.codehaus.mojo:xml-maven-plugin:1.0.1:transform (validate) on project 
analysis: Failed to transform input file 
lucene/src/main/xml/ENTITY_TermQuery.xml: I/O error reported by XML parser 
processing file://lucene/src/main/xml/ENTITY_TermQuery.xml: www.bar.xyz:
Unknown host www.bar.xyz
]>

Apparently the www.bar.xyz host is supposed to be a placeholder, but for the
life of me I cannot see where the www.bar.xyz placeholder is replaced by a
valid URL.
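
For context on the "Unknown host" message: in standard XML processing an entity
declared with SYSTEM is an external entity, and the parser dereferences its
system identifier as written when it needs the replacement text. The URL is not
a template that something fills in later. The sketch below shows one possible
way to keep that lookup local, using the standard JAXP EntityResolver hook. It
is only an illustration under assumptions: the class LocalEntityDemo, the
helper parseWithLocalEntities, the <TermQuery> body and the replacement text
"local-term" are all hypothetical, and it does not show how to wire a resolver
into the Ant or Maven XSLT tasks.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class LocalEntityDemo {

    // Parses the given XML and answers every external entity lookup with the
    // supplied local text, so the parser never opens a network connection.
    // (Hypothetical helper, written for this illustration only.)
    static String parseWithLocalEntities(String xml, String replacement) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();

        // The resolver is consulted before the parser tries to fetch systemId.
        // Returning an InputSource here replaces the remote fetch.
        builder.setEntityResolver(
            (publicId, systemId) -> new InputSource(new StringReader(replacement)));

        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // The element body is an assumption; the original file is elided after
        // the entity declarations.
        String xml =
            "<!DOCTYPE TermQuery [\n"
            + "<!ENTITY internalTerm \"sumitomo\">\n"
            + "<!ENTITY externalTerm SYSTEM \"http://www.bar.xyz/external\">\n"
            + "]>\n"
            + "<TermQuery>&internalTerm; &externalTerm;</TermQuery>";

        System.out.println(parseWithLocalEntities(xml, "local-term"));
    }
}
```

Without the resolver, the same parse would try to contact www.bar.xyz as soon
as &externalTerm; is expanded, which matches the error reported above.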

(I haven't used DTDs in at least 10 years and I am way out of my element when
trying to resolve this.)
Any suggestions?
Martin
________________________________
From: David Smiley <david.w.smi...@gmail.com>
Sent: Sunday, November 3, 2019 12:32 AM
To: Solr/Lucene Dev <dev@lucene.apache.org>
Cc: Mark Miller <markrmil...@gmail.com>
Subject: Re: SolrCloud is sick.

Yeah we do a bad job of the things you listed Noble.  :-(   My colleagues want 
pointers to internal docs but the sad reality is there aren't any.  You may 
notice I'm a stickler in my code reviews for requiring javadocs on all top 
level classes.  I think more javadocs and code comments would be very helpful 
-- especially for the major classes.  This might help us all and others a lot 
more.  For example I think Lucene does a rather fine job of this for its major 
classes -- IndexWriter being a good example.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Sat, Nov 2, 2019 at 7:32 PM Noble Paul <noble.p...@gmail.com> wrote:
Hi,

I believe there is a consensus on what is wrong with the way we have built the 
cluster state and overseer. We need to focus a bit more on the design aspect. 
Design, according to me, has the following elements:

* How does it work?

* What are the performance characteristics? Can it be done more efficiently?

* What are the public touch points?

** Which are the files we store in ZK? Are they expected to be watched always?

** Or are they read on demand?

** The public APIs. Do they make sense to the user? Can they be further 
simplified? How do they compare to the other APIs in the system?


We, as a community, do a bad job in dealing with these. While we focus on 
internal things, these are not discussed until it is too late. We usually do 
coding, tests, code review (sometimes) and commit. This leads to huge technical 
debt.


This is not to put blame on one person or a group of people. (I occasionally 
see people discussing design issues upfront, I just hope that is the norm.)


Now, why am I discussing this in this thread?


While we agree there are problems, we are trying to solve the problem using the 
same process we used to create these problems. Again, I'm not questioning the 
intent or competence of anyone. Unless we set the process right, we are doomed 
to make the same mistakes again.


I wholeheartedly endorse any effort to improve SolrCloud/overseer. At the same 
time I fail to see us leveraging the collective experience of our community 
through meaningful discussion.


I hope we don't resort to personal attacks and use this as an opportunity to 
improve our processes.
Thanks

On Sun, Nov 3, 2019, 9:52 AM Scott Blum <dragonsi...@gmail.com> wrote:
Very much agreed.  I've been trying to figure out for a long time what the 
point is of having a replica DOWN state that has to be toggled (DOWN and then 
UP!) every time a node restarts, considering that we could just combine ACTIVE 
and `live_nodes` to understand whether a replica is available.  It's not even 
foolproof, since kill -9 on a Solr node won't mark all the replicas DOWN -- 
that doesn't happen until the node comes back up (perversely).
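
The idea above, deriving availability instead of storing a toggled DOWN state,
could look roughly like this. It is a minimal sketch and not Solr's actual API:
the class ReplicaAvailability, the State enum and the isAvailable helper are
hypothetical names, and the node names are made up.

```java
import java.util.Set;

public class ReplicaAvailability {

    // Hypothetical recorded states; a real cluster state may have more.
    enum State { ACTIVE, RECOVERING, DOWN }

    // A replica counts as available only if its recorded state is ACTIVE and
    // the node hosting it currently appears in live_nodes.
    static boolean isAvailable(State recorded, String nodeName, Set<String> liveNodes) {
        return recorded == State.ACTIVE && liveNodes.contains(nodeName);
    }

    public static void main(String[] args) {
        Set<String> liveNodes = Set.of("node1:8983_solr");

        // Node is alive and the replica was last recorded as ACTIVE.
        System.out.println(isAvailable(State.ACTIVE, "node1:8983_solr", liveNodes));

        // Node was killed (for example with kill -9): the recorded state is
        // still ACTIVE, but the node is missing from live_nodes, so the
        // replica is treated as unavailable without any state being rewritten.
        System.out.println(isAvailable(State.ACTIVE, "node2:8983_solr", liveNodes));
    }
}
```

Under this scheme a restart would not need to write DOWN and then ACTIVE again
for every replica; the membership of live_nodes carries that information.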

What would it take to get to a state where restarting a node would require a 
minimal amount of ZK work in most cases?

On Sat, Nov 2, 2019 at 5:44 PM Mark Miller <markrmil...@gmail.com> wrote:
Give me a short bit to follow up and I will lay out my case and proposal.

Everyone is then free to decide that we need to do something drastic or that 
I'm wrong and we should just continue down the same road. If that's the case, a 
lot of your work will get a lot easier and less impeded by me and we will still 
all be happier. Win win.

If we can just not make drastic changes for just a brief window of a week or 
so, I'll say what I have to say, you guys can judge and do whatever you please.

- mark

On Fri, Nov 1, 2019 at 7:46 PM Mark Miller <markrmil...@gmail.com> wrote:
Hey all Solr devs,

SolrCloud is sick right now. The way low-level ZooKeeper is handled, the 
Overseer, is a mix and mess of proper exception handling and super slow startup 
and shutdown, adding new things all the time with no concern for performance or 
proper ordering (which is harder to tell than you think).

Our class dependency graph doesn't even work - we just force it. Sort of. If 
the whole system doesn't block and choke its way to a start slowly enough, lots 
of things fail.

This thing coughs up: you toss stuff into the storm and, a good chunk of the 
time, what you want eventually comes back without causing too much damage.

There are so many things that are off or just plain wrong, and the list is 
growing and growing. No one is following this, or if you are, please back me 
up. This thing will collapse under its own weight.

So if you want to add yet another state format cluster state or some other 
optimization on this junk heap, you can expect me to push back.

We should all be embarrassed by the state of things.

I've got some ideas for addressing them that I'll share soon, but god, don't 
keep optimizing a turd in non backcompat Overseer loving ways. That Overseer is 
an atrocity.

--
- Mark

http://about.me/markrmiller


