loading dataimport configs
I am using the data importer that feeds off of MySQL. When adding new DataImportHandler requestHandlers to solrconfig.xml, I can upload my changes with the following command:

./zkcli.sh -zkhost 10.0.1.107:2181 -cmd upconfig -confdir configs -confname collection1

Good: I can see the changed files in ZooKeeper. I can get them and see that their contents changed as well.

Bad: When I call curl http://localhost:8983/solr/collection1/data-point-1?command=full-import or browse to http://solr-ip/solr/#/collection1/dataimport/data-point-1, Solr complains that the config for data-point-1 (for example) cannot be found.

Any ideas what I might be doing wrong?

-- CTO Zenlok株式会社
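Two things worth checking: the handler name in solrconfig.xml must match the path in the query URL exactly, and the running cores must be reloaded after the upconfig, since they keep serving the old config until then. A minimal sketch of what the handler entry would look like (the handler name, class, and config file name here are assumptions inferred from the URLs above):

```xml
<!-- hypothetical DIH handler: the name must match the /data-point-1 path
     used in the query URL, and the referenced DIH config file must also
     have been uploaded with upconfig -->
<requestHandler name="/data-point-1"
                class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">data-point-1-config.xml</str>
  </lst>
</requestHandler>
```

After uploading the changed config, reload the collection (or restart the nodes) so the new handler is actually registered.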
Zookeeper dataimport.properties node
- Is dataimport.properties ever written to the filesystem? (Trying to determine whether I have a permissions error, because I don't see it anywhere on disk.)
- How do you manually edit dataimport.properties? My system is periodically pulling in new data. If that process has issues, I want to be able to reset to an earlier known-good timestamp value.

Regards,
Nate

-- CTO Zenlok株式会社
SOLR Num Docs vs NumFound
On my Solr 4 setup, a *:* query returns a higher numFound value than the Num Docs value reported on the statistics page of collection1. Why is that? My data is split across 3 data import handlers, where each handler has the same type of data but the ids are guaranteed to be different. Are some of my documents not hard committed? If so, how do I hard commit? Otherwise, why are these numbers different?

-- CTO Zenlok株式会社
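If uncommitted documents are the suspect, an explicit hard commit can be issued against the update handler. A minimal sketch, assuming the stock Solr 4 example host, port, and collection name (the command is only assembled and printed here, so it can be copied against a live node):

```shell
# hard commit: flush pending documents to the index so they are visible
# and counted consistently (stock example host/port assumed; adjust to fit)
COMMIT_URL='http://localhost:8983/solr/collection1/update?commit=true'
echo "curl '${COMMIT_URL}'"
```

Note also that if the collection has more than one shard, a *:* numFound spans the whole collection while the Num Docs figure on the statistics page is per core, so the two are not expected to match on a multi-shard setup.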
Zookeeper and DataImportHandler properties
I realize this is not a ZooKeeper-specific mailing list, but I am wondering if anybody has a simple process for updating ZooKeeper files other than restarting a Solr instance? Specifically the dataimport.properties file, which doesn't appear to be written to disk but rather exists only in ZooKeeper itself. How can I edit this value? I am unfamiliar with zkCli.sh and am not sure how to enter new lines in manually typed set commands.

Regards,
Nate

-- CTO Zenlok株式会社
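One workable pattern is to avoid multi-line set commands entirely: pull the file out of ZooKeeper, edit it locally, and push it back. A sketch, assuming the znode lives at /configs/collection1/dataimport.properties and that your Solr release's zkcli.sh supports the getfile/putfile commands (only the local edit actually runs below; the fetch and push steps are shown as comments):

```shell
# Pretend this file was fetched from ZooKeeper with:
#   ./zkcli.sh -zkhost 10.0.1.107:2181 -cmd getfile \
#       /configs/collection1/dataimport.properties /tmp/dataimport.properties
cat > /tmp/dataimport.properties <<'EOF'
last_index_time=2013-03-06 12\:02\:22
email_history.last_index_time=2013-03-06 12\:02\:22
EOF

# roll every timestamp back to an earlier known-good value
sed -i 's/2013-03-06 12\\:02\\:22/2013-03-05 00\\:00\\:00/g' /tmp/dataimport.properties
cat /tmp/dataimport.properties

# ...then push the edited file back:
#   ./zkcli.sh -zkhost 10.0.1.107:2181 -cmd putfile \
#       /configs/collection1/dataimport.properties /tmp/dataimport.properties
```

Reloading the collection afterwards ensures the import process picks up the reset timestamp.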
Re: Quick Questions
On 03/08/2013 05:06 PM, Upayavira wrote:
> In example/cloud-scripts/ you will find a Solr-specific zkCli tool to upload/download configs. You will need to reload a core/collection for the changes to take effect.
> Upayavira
>
> On Fri, Mar 8, 2013, at 07:02 AM, Nathan Findley wrote:
>> I am setting up SolrCloud with ZooKeeper.
>> - I am wondering if there are nicer ways to update the ZooKeeper config files (data-import) besides restarting a node with the bootstrap option?
>> - Right now I kill the node manually in order to restart it. Is there a better way to restart?
>> Thanks, Nate

Ok, that is good to know. Using ZooKeeper I can see the following dataimport.properties:

last_index_time=2013-03-06 12\:02\:22
email_history.last_index_time=2013-03-06 12\:02\:22
...

The problem is that last_index_time is not being changed when I run a delta import. Any ideas why? If it is a permissions issue, I am a bit confused, because I am testing as the root user and don't see any errors indicating that ZooKeeper is failing to write to the filesystem.

Thanks,
Nate

-- CTO Zenlok株式会社
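For the reload step Upayavira mentions, the Collections API can reload every core in the collection in one call, which avoids restarting nodes entirely. A sketch, assuming the stock example host and port (the command is only assembled and printed here):

```shell
# reload all cores of collection1 so a freshly uploaded config takes effect
# (stock example host/port assumed; adjust for your deployment)
RELOAD_URL='http://localhost:8983/solr/admin/collections?action=RELOAD&name=collection1'
echo "curl '${RELOAD_URL}'"
```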
Quick Questions
I am setting up SolrCloud with ZooKeeper.
- I am wondering if there are nicer ways to update the ZooKeeper config files (data-import) besides restarting a node with the bootstrap option?
- Right now I kill the node manually in order to restart it. Is there a better way to restart?

Thanks,
Nate

-- CTO Zenlok株式会社
Re: Solr 3.6 - 4.0
Otis,

I believe I found the thread which contains a link about Elasticsearch and big data: http://www.elasticsearch.org/videos/2012/06/05/big-data-search-and-analytics.html

We are dealing with data that is searched using time ranges. Does the time-based data flow concept work in Solr? Does it mean I can add shards to the existing collection and have it just work? If this concept is more readily used in Elasticsearch, I have no problem with using that instead of Solr. We need to be able to maintain searches across shards whatever the case may be.

Thanks for your time,
Nate

On 11/05/2012 03:16 AM, Otis Gospodnetic wrote:
> Correct. There was a good thread on this topic on the ElasticSearch ML. Search for oversharding and my name. Same ideas apply to SolrCloud. Neither server offers automatic rebalancing yet, though ES lets you move shards around on demand.
> Otis
> -- Performance Monitoring - http://sematext.com/spm
>
> On Nov 4, 2012 12:20 PM, Nathan Findley nat...@zenlok.com wrote:
>> Otis,
>> Thanks, that makes sense. I have one more question: at this point, is the only way to allow for future expansion of shard count to run more than one shard per machine and then, when things grow, move each shard to its own dedicated machine? That is how I understand it from the wiki. So for instance I could have 10 shards where 2 machines have 5 shards each. Then I could move those shards to their own machines as the index grows. Is this correct? Does it apply to replicas as well (5 per 2 replica machines)? Finally, is the ability to add more shards something on the feature list?
>> Regards, Nate
>>
>> On 11/03/2012 10:11 PM, Otis Gospodnetic wrote:
>>> Hi,
>>> Check the archive for a similar Q&A yesterday. Reindexing would be the cleanest.
>>> Otis
>>> -- Performance Monitoring - http://sematext.com/spm
>>>
>>> On Nov 3, 2012 8:22 AM, Nathan Findley nat...@zenlok.com wrote:
>>>> Hi all,
>>>> I have one machine running Solr 3.6. I would like to move this data to Solr 4.0 and set up a SolrCloud. I feel like I should replicate the existing data. After that, it isn't clear to me what I need to do.
>>>> 1) Create a slave (4.0) that replicates from the master (3.6).
>>>> 2) Somehow turn the slave into part of a SolrCloud?
>>>> If there are any online articles about this process or you have any suggestions, I would appreciate it!
>>>> Thanks, Nate

-- CTO Zenlok株式会社
Solr 3.6 - 4.0
Hi all,

I have one machine running Solr 3.6. I would like to move this data to Solr 4.0 and set up a SolrCloud. I feel like I should replicate the existing data. After that, it isn't clear to me what I need to do.

1) Create a slave (4.0) that replicates from the master (3.6).
2) Somehow turn the slave into part of a SolrCloud?

If there are any online articles about this process or you have any suggestions, I would appreciate it!

Thanks,
Nate
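For step 1, the slave side of master/slave replication is configured in solrconfig.xml on the 4.0 node. A sketch with a placeholder master host (note the advice elsewhere in this thread that reindexing is the cleanest route across a major-version jump; pulling a 3.6 index into 4.0 leans on Lucene's backward-compatible index reading):

```xml
<!-- hypothetical slave config for the 4.0 node; master-host is a placeholder
     for the 3.6 master's address -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <str name="masterUrl">http://master-host:8983/solr/replication</str>
    <str name="pollInterval">00:00:60</str>
  </lst>
</requestHandler>
```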