Can't thank you enough for clarifying it at length. Yeah, it's pretty confusing even for experienced Solr users. I used the upconfig and linkconfig commands to update 4 collections into Zookeeper... As you described, I lucked out as I used the same name for the configset and the collection, and hence did not have to use the collections API :-)
Thanks,
Ravi Kiran Bhaskar

On Sat, Sep 19, 2015 at 11:22 PM, Erick Erickson <erickerick...@gmail.com> wrote:
> Let's back up a second. Configsets are what _used_ to be in the conf
> directory for each core on a local drive; it's just that they're now
> kept up on Zookeeper. Otherwise, you'd have to put them on each
> instance in SolrCloud, and bringing up a new replica on a new machine
> would look a lot like adding a core with the old core admin API.
>
> So instead, configurations are kept on Zookeeper. A config set
> consists of, essentially, a named old-style "conf" directory. There's
> no a-priori limit to the number of config sets you can have. Look in
> the admin UI, Cloud>>tree>>configs, and you'll see each name you've
> pushed to ZK. If you explore that tree, you'll see a lot of old
> familiar faces: schema.xml, solrconfig.xml, etc.
>
> So now we come to associating configs with collections. You've
> probably done one of the examples where some things happen under the
> covers, including explicitly pushing the configset to Zookeeper.
> Currently, there's no option in the bin/solr script to push a config,
> although I know there's a JIRA to do that.
>
> So, to put a new config set up, you currently need to use the zkcli.sh
> script; see the "upconfig" command at:
> https://cwiki.apache.org/confluence/display/solr/Command+Line+Utilities
> That pushes the configset up to ZK and gives it a name.
>
> Now, you create a collection, and it needs a configset stored in ZK.
> It's a little tricky in that if you do _not_ explicitly specify a
> configset (using the collection.configName parameter to the
> collections API CREATE command), then by default it'll look for a
> configset with the same name as the collection. If it doesn't find
> one, _and_ there is one and only one configset, then it'll use that
> one (personally I find that confusing, but that's the way it works).
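A minimal sketch of the upconfig-then-CREATE sequence described above. All hosts, paths, and names here are placeholders, not values from this thread; the zkcli.sh location varies by install (it ships under server/scripts/cloud-scripts/ in Solr 5.x):

```shell
# Placeholder values (assumptions) -- adjust for your install.
ZKHOST="localhost:2181"
CONFNAME="myconf"
COLLECTION="mycollection"

# 1. Push an old-style conf directory up to ZK as a named configset:
#      ./zkcli.sh -zkhost "$ZKHOST" -cmd upconfig -confdir ./conf -confname "$CONFNAME"

# 2. Create the collection, naming the configset explicitly so the
#    "same name as the collection" fallback never has to kick in:
CREATE_URL="http://localhost:8983/solr/admin/collections?action=CREATE&name=$COLLECTION&numShards=1&replicationFactor=2&collection.configName=$CONFNAME"
echo "$CREATE_URL"   # issue with: curl "$CREATE_URL"
```

Naming the configset independently of the collection is what lets several collections share one configset later.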
> See: https://cwiki.apache.org/confluence/display/solr/Collections+API
>
> If you have two or more configsets in ZK, then either the configset
> name has to be identical to the collection name (if you don't specify
> collection.configName), _or_ you specify collection.configName at
> create time.
>
> NOTE: there are _no_ config files on the local disk! When a replica of
> a collection loads, it "knows" what collection it's part of and pulls
> the corresponding configset from ZK.
>
> So typically the process is this:
>
> you create the config set by editing all the usual suspects: schema.xml,
> solrconfig.xml, DIH config, etc.
>
> you put those configuration files into some version control system (you
> are using one, right?)
>
> you push the configs to Zookeeper
>
> you create the collection
>
> you figure out you need to change the configs, so you
>
> check the code out of your version control
>
> edit them
>
> put the current version back into version control
>
> push the configs up to Zookeeper, overwriting the ones already
> there with that name
>
> reload the collection or bounce all the servers. As each replica
> in the collection comes up, it downloads the latest configs from
> Zookeeper to memory (not to disk) and uses them.
>
> Seems like a long drawn-out process, but pretty soon it's automatic.
> And really, the only extra step is the push to Zookeeper; the rest is
> just like old-style cores, with the exception that you don't have to
> manually push all the configs to all the machines hosting cores.
>
> Notice that I have mostly avoided talking about "cores" here. Although
> it's true that a replica in a collection is just another core, it's
> "special" in that it has certain very specific properties set. I
> _strongly_ advise you stop thinking about old-style Solr cores and
> instead think about collections and replicas.
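The update cycle at the end of that list can be sketched as follows; the host, configset, and collection names are placeholders (assumptions), not values from the thread:

```shell
# Placeholder values (assumptions) -- adjust for your install.
ZKHOST="localhost:2181"
CONFNAME="myconf"
COLLECTION="mycollection"

# After editing the checked-out configs and committing them to version
# control, overwrite the configset of the same name already in ZK:
#      ./zkcli.sh -zkhost "$ZKHOST" -cmd upconfig -confdir ./conf -confname "$CONFNAME"

# Then reload, so every replica pulls the fresh configs from ZK into memory:
RELOAD_URL="http://localhost:8983/solr/admin/collections?action=RELOAD&name=$COLLECTION"
echo "$RELOAD_URL"   # issue with: curl "$RELOAD_URL"
```

RELOAD acts on the whole collection at once, which is why no per-machine copying is needed.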
> And above all, do _not_
> use the admin core API to try to create members of a collection
> (cores); use the collections API ADDREPLICA/DELETEREPLICA instead.
> Loading/unloading cores is less "fraught", but I try to avoid that too
> and use
>
> Best,
> Erick
>
> On Sat, Sep 19, 2015 at 9:08 PM, Ravi Solr <ravis...@gmail.com> wrote:
> > Thanks Erick, I will report back once the reindex is finished. Oh, your
> > answer reminded me of another question - regarding configsets, the
> > documentation says:
> >
> > "On a multicore Solr instance, you may find that you want to share
> > configuration between a number of different cores."
> >
> > Can the same be used to push disparate, mutually exclusive configs? I
> > ask this as I have 4 mutually exclusive apps, each with a single-core
> > index on a single machine, which I am trying to convert to SolrCloud
> > with a single-shard approach. Just being lazy and trying to find a way
> > to update and link configs to Zookeeper ;-)
> >
> > Thanks
> >
> > Ravi Kiran Bhaskar
> >
> > On Sat, Sep 19, 2015 at 6:54 PM, Erick Erickson <erickerick...@gmail.com>
> > wrote:
> >
> >> Just pushing up the entire configset would be easiest, but the
> >> Zookeeper command line tools allow you to push up a single
> >> file if you want.
> >>
> >> Yeah, it puzzles me too that the import worked yesterday; not really
> >> sure what happened, the file shouldn't just disappear....
> >>
> >> Erick
> >>
> >> On Sat, Sep 19, 2015 at 2:46 PM, Ravi Solr <ravis...@gmail.com> wrote:
> >> > Thank you for the prompt response, Erick. I did a full-import
> >> > yesterday; you are correct that I did not push dataimport.properties
> >> > to ZK. Should it not have worked even for a full import? You may be
> >> > right about the 'clean' option; I will reindex again today. BTW, how
> >> > do we push a single file to a specific config name in Zookeeper?
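A sketch of the two commands touched on above: pushing a single file with zkcli.sh's putfile command, and adding a replica through the collections API rather than the core admin API. The local file path and the collection/shard values are placeholders (assumptions); the znode path matches the config name that appears in the error later in this thread:

```shell
# Placeholder values (assumptions) -- adjust for your install.
ZKHOST="localhost:2181"

# Push one file to an explicit znode inside an existing configset:
ZNODE="/configs/sitesearchcore/dataimport.properties"
#      ./zkcli.sh -zkhost "$ZKHOST" -cmd putfile "$ZNODE" ./conf/dataimport.properties
echo "$ZNODE"

# Add a replica via the collections API instead of the core admin API:
ADD_URL="http://localhost:8983/solr/admin/collections?action=ADDREPLICA&collection=mycollection&shard=shard1"
echo "$ADD_URL"   # issue with: curl "$ADD_URL"
```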
> >> >
> >> > Thanks,
> >> >
> >> > Ravi Kiran Bhaskar
> >> >
> >> > On Sat, Sep 19, 2015 at 1:48 PM, Erick Erickson <erickerick...@gmail.com>
> >> > wrote:
> >> >
> >> >> Could not read DIH properties from
> >> >> /configs/sitesearchcore/dataimport.properties
> >> >>
> >> >> This looks like somehow you didn't push this file up to Zookeeper. You
> >> >> can check what files are there in the admin UI. How you indexed
> >> >> yesterday is a mystery though, unless somehow this file was removed
> >> >> from ZK.
> >> >>
> >> >> As for why you lost all the docs, my suspicion is that you have the
> >> >> clean param set up for delta import....
> >> >>
> >> >> FWIW,
> >> >> Erick
> >> >>
> >> >> On Sat, Sep 19, 2015 at 10:36 AM, Ravi Solr <ravis...@gmail.com> wrote:
> >> >> > I am facing a weird problem. As part of an upgrade from 4.7.2
> >> >> > (Master-Slave) to 5.3.0 (SolrCloud), I re-indexed 1.5 million
> >> >> > records via DIH using SolrEntityProcessor yesterday; all of them
> >> >> > indexed properly. This morning I just ran the DIH again with delta
> >> >> > import and I lost all docs... what am I missing? Did anybody face a
> >> >> > similar issue?
> >> >> >
> >> >> > Here are the errors in the logs:
> >> >> >
> >> >> > 9/19/2015, 2:41:17 AM ERROR null SolrCore Previous SolrRequestInfo
> >> >> > was not closed!
> >> >> > req=waitSearcher=true&distrib.from=http://10.128.159.32:8983/solr/sitesearchcore/&update.distrib=FROMLEADER&openSearcher=true&commit=true&wt=javabin&expungeDeletes=false&commit_end_point=true&version=2&softCommit=false
> >> >> > 9/19/2015, 2:41:17 AM ERROR null SolrCore prev == info : false
> >> >> > 9/19/2015, 2:41:17 AM WARN null ZKPropertiesWriter Could not read DIH
> >> >> > properties from /configs/sitesearchcore/dataimport.properties :class
> >> >> > org.apache.zookeeper.KeeperException$NoNodeException
> >> >> >
> >> >> > org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode
> >> >> > = NoNode for /configs/sitesearchcore/dataimport.properties
> >> >> >     at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
> >> >> >     at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> >> >> >     at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1155)
> >> >> >     at org.apache.solr.common.cloud.SolrZkClient.getData(SolrZkClient.java:349)
> >> >> >     at org.apache.solr.handler.dataimport.ZKPropertiesWriter.readIndexerProperties(ZKPropertiesWriter.java:91)
> >> >> >     at org.apache.solr.handler.dataimport.ZKPropertiesWriter.persist(ZKPropertiesWriter.java:65)
> >> >> >     at org.apache.solr.handler.dataimport.DocBuilder.finish(DocBuilder.java:307)
> >> >> >     at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:253)
> >> >> >     at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:416)
> >> >> >     at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:480)
> >> >> >     at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:461)
> >> >> >
> >> >> > 9/19/2015, 11:16:43 AM ERROR null SolrCore Previous SolrRequestInfo
> >> >> > was not closed!
> >> >> > req=waitSearcher=true&distrib.from=http://10.128.159.32:8983/solr/sitesearchcore/&update.distrib=FROMLEADER&openSearcher=true&commit=true&wt=javabin&expungeDeletes=false&commit_end_point=true&version=2&softCommit=false
> >> >> > 9/19/2015, 11:16:43 AM ERROR null SolrCore prev == info : false
> >> >> >
> >> >> > Thanks
> >> >> >
> >> >> > Ravi Kiran Bhaskar
> >> >
> >>
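On the "clean param" suspicion raised above: a sketch of a DIH delta-import request that passes clean=false explicitly, so the index is never wiped before the run regardless of any defaults configured for the handler. The host is a placeholder; the core name is taken from the thread, and whether this matches your handler configuration is an assumption:

```shell
# Explicitly disable "clean" on the delta-import so existing docs are kept.
DELTA_URL="http://localhost:8983/solr/sitesearchcore/dataimport?command=delta-import&clean=false&commit=true"
echo "$DELTA_URL"   # issue with: curl "$DELTA_URL"
```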