[jira] [Commented] (SOLR-1395) Integrate Katta

JohnWu (JIRA) Thu, 26 May 2011 19:30:31 -0700

    [ 
https://issues.apache.org/jira/browse/SOLR-1395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13040037#comment-13040037
 ]


JohnWu commented on SOLR-1395:
------------------------------

Stefan Groschupf, Eric Pugh, Pulkit Agrawal, and others:

Today I uploaded a figure for the SOLR-1395 patch that describes why we use 
this patch.
(solr1395.jpg                 27/May/11 01:00 226 kB)

The whole architecture contains 3 levels:
1) Solr level
2) Katta slave node level
3) Hadoop level

Together, these frameworks give Lucene a web app server, distributed search, 
data storage, and MapReduce features.

I will provide a detailed document about how to use this patch later. For now, 
here is the coarse-grained process.

a) Build the environment

   1) Install Hadoop (ensure you can browse your files in HDFS).
   2) Install ZooKeeper (ensure zkCli.sh can connect to the server).
   3) Install Katta (ensure the master node and datanodes can run; use
   "katta check" to check the shards; "start Master -ne" means you start the
   Katta master in non-embedded mode).

b) Patch Solr

   1) Check out the code from http://svn.apache.org/repos/asf/lucene/dev/trunk
   (see the comment tom liu added on 20/Oct/10 06:19).

   2) Apply the patch, and manually patch any code that is rejected
   (solr-1395-katta-0.6.2-3.patch       10/Nov/10 02:12 108 kB).

   3) Correct the code of QueryComponent (Solr):

   // JohnWu: changed && to ||, and added an explicit check for shards == null
   if (shards == null) {
     hasShardURL = false;
   } else {
     // shards is non-null here, so this expression always evaluates to true
     hasShardURL = shards != null || shards.indexOf('/') > 0;
   }
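To see what the change does, the sketch below compares the two forms of the check. It assumes the line before the change was the same expression with && in place of ||, as the code comment suggests. The class and method names are made up for illustration and are not part of Solr.

```java
public class HasShardUrlCheck {
    // Assumed form of the check before the change: both conditions must hold.
    public static boolean original(String shards) {
        return shards != null && shards.indexOf('/') > 0;
    }

    // The check after the change, written as in this comment.
    public static boolean patched(String shards) {
        boolean hasShardURL;
        if (shards == null) {
            hasShardURL = false;
        } else {
            // shards is non-null in this branch, so || short-circuits to true
            hasShardURL = shards != null || shards.indexOf('/') > 0;
        }
        return hasShardURL;
    }

    public static void main(String[] args) {
        String[] samples = { null, "*", "localhost:8983/solr" };
        for (String s : samples) {
            System.out.println(s + " -> original=" + original(s)
                    + ", patched=" + patched(s));
        }
    }
}
```

For a shards value of "*", which contains no '/', the && form returns false and the patched form returns true. In effect, the patched check reduces to "shards is not null".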


c) Configure and query the system (katta-solrcores.jpg    03/Dec/10 03:44    94 kB)
   
    1) The web container (Tomcat) starts the Solr server that the figure shows
    as the proxy. You need to edit its solrconfig.xml:

       <requestHandler name="standard" class="solr.KattaRequestHandler" default="true">
          <lst name="defaults">
            <str name="echoParams">explicit</str>
            <str name="shards">*</str>
          </lst>
       </requestHandler>
    
     Solr will then use the Katta client to dispatch the query to the subproxy
     nodes (Katta datanodes).
   
    2) Start the Katta datanode with embedded Solr.

    Edit the katta sh script as follows:
    KATTA_OPTS="$KATTA_OPTS -Dsolr.home=/var/data/solr -Dsolr.directoryFactory=solr.MMapDirectoryFactory"
    
    Add "zookeeper.servers=localhost:2181" and "zookeeper.embedded=false" to
    katta.zk.properties, and put this file on your classpath.
    
    For the proxy Solr config, edit solrconfig.xml as follows:
    <requestHandler name="standard" class="solr.SearchHandler" default="true">...</requestHandler>

    3) Deploy your queryCore.zip (for the folder hierarchy, please see TomLiu's
    comments) with "katta addIndex queryCore**.zip hdfs://******".
    The deployed queryCore has the following solrconfig.xml:
    <requestHandler name="standard" class="solr.MultiEmbeddedSearchHandler" default="true">...</requestHandler>

    4) Use the following query:
    
http://localhost:8080/solr-1395-katta-0.6.2-2patch/select/?q=apple&version=2.2&start=0&rows=10&indent=on&isShard=false&distrib=true

    For the index, you can use the example from Solr 1.4.

    OK, the hits returned:
    
    <result name="response" numFound="1" start="0">
     <doc>
        <str name="id">MA147LL/A</str>
        <str name="name">Apple 60 GB iPod with Video Playback Black</str>
        <str name="manu">Apple Computer Inc.</str>
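The query from step 4 can also be assembled in code. The following is a minimal sketch that only builds the URL string with the same parameters as above; it does not send the request. The class name KattaQueryUrl and the method build are hypothetical, and the base URL is simply the one used in this comment.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

public class KattaQueryUrl {
    // URL-encode one key or value as UTF-8.
    private static String enc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM
            throw new IllegalStateException(e);
        }
    }

    // Build a /select/ URL with the parameters used in step 4.
    public static String build(String baseUrl, String queryText) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", queryText);
        params.put("version", "2.2");
        params.put("start", "0");
        params.put("rows", "10");
        params.put("indent", "on");
        params.put("isShard", "false");
        params.put("distrib", "true");

        StringBuilder sb = new StringBuilder(baseUrl).append("/select/?");
        boolean first = true;
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (!first) {
                sb.append('&');
            }
            sb.append(enc(e.getKey())).append('=').append(enc(e.getValue()));
            first = false;
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(
            build("http://localhost:8080/solr-1395-katta-0.6.2-2patch", "apple"));
    }
}
```

A LinkedHashMap keeps the parameters in insertion order, so the generated string matches the query shown in step 4.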

d) Some amazing things

If one node crashes, the other nodes keep running. The system redeploys the 
crashed node's set of indexes to a new node, which keeps the system stable on 
the fly and keeps the number of replicas fixed.
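As a rough illustration of this behavior, the toy model below keeps every shard on a fixed number of nodes and, when a node is removed, reassigns that node's shards to the least-loaded remaining nodes that do not already hold them. This is not Katta's actual algorithm or API; every class and method name here is hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

public class ReplicationSketch {
    private final int replicationLevel;
    // node name -> shards currently served by that node
    private final Map<String, Set<String>> shardsByNode = new TreeMap<>();

    public ReplicationSketch(int replicationLevel, List<String> nodes) {
        this.replicationLevel = replicationLevel;
        for (String node : nodes) {
            shardsByNode.put(node, new TreeSet<String>());
        }
    }

    // Add replicas of a shard until it reaches the replication level,
    // or until no node without the shard is left.
    public void deploy(String shard) {
        while (replicaCount(shard) < replicationLevel) {
            String target = null;
            for (Map.Entry<String, Set<String>> e : shardsByNode.entrySet()) {
                if (e.getValue().contains(shard)) {
                    continue; // this node already serves the shard
                }
                if (target == null
                        || e.getValue().size() < shardsByNode.get(target).size()) {
                    target = e.getKey(); // prefer the least-loaded node
                }
            }
            if (target == null) {
                return; // not enough nodes to reach the replication level
            }
            shardsByNode.get(target).add(shard);
        }
    }

    // Remove a node and redeploy the shards it was serving.
    public void nodeCrashed(String node) {
        Set<String> lost = shardsByNode.remove(node);
        if (lost == null) {
            return; // unknown node, nothing to do
        }
        for (String shard : lost) {
            deploy(shard);
        }
    }

    public int replicaCount(String shard) {
        int count = 0;
        for (Set<String> shards : shardsByNode.values()) {
            if (shards.contains(shard)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        ReplicationSketch cluster =
            new ReplicationSketch(2, Arrays.asList("n1", "n2", "n3"));
        cluster.deploy("shardA");
        cluster.deploy("shardB");
        cluster.nodeCrashed("n1");
        System.out.println("shardA replicas: " + cluster.replicaCount("shardA"));
        System.out.println("shardB replicas: " + cluster.replicaCount("shardB"));
    }
}
```

With three nodes and a replication level of 2, removing one node still leaves two nodes, so each shard returns to two replicas after redeployment.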

Summary

If you use the patch, you need to read all the comments in date order.
If you want a flexible structure, please use this patch.
If you want to use Solr multicore, please use this patch.

Thanks to TomLiu, Jason Rutherglen and Jason Venner.

Thanks a lot!

JohnWu

> Integrate Katta
> ---------------
>
>                 Key: SOLR-1395
>                 URL: https://issues.apache.org/jira/browse/SOLR-1395
>             Project: Solr
>          Issue Type: New Feature
>    Affects Versions: 1.4
>            Reporter: Jason Rutherglen
>            Priority: Minor
>             Fix For: 3.2
>
>         Attachments: SOLR-1395.patch, SOLR-1395.patch, SOLR-1395.patch, 
> back-end.log, front-end.log, hadoop-core-0.19.0.jar, katta-core-0.6-dev.jar, 
> katta-solrcores.jpg, katta.node.properties, katta.zk.properties, 
> log4j-1.2.13.jar, solr-1395-1431-3.patch, solr-1395-1431-4.patch, 
> solr-1395-1431-katta0.6.patch, solr-1395-1431-katta0.6.patch, 
> solr-1395-1431.patch, solr-1395-katta-0.6.2-1.patch, 
> solr-1395-katta-0.6.2-2.patch, solr-1395-katta-0.6.2-3.patch, 
> solr-1395-katta-0.6.2.patch, solr1395.jpg, test-katta-core-0.6-dev.jar, 
> zkclient-0.1-dev.jar, zookeeper-3.2.1.jar
>
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> We'll integrate Katta into Solr so that:
> * Distributed search uses Hadoop RPC
> * Shard/SolrCore distribution and management
> * Zookeeper based failover
> * Indexes may be built using Hadoop

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

