ManagedIndexSchema takes long for larger schema changes

2020-12-10 Thread Tiziano Degaetano
Hello,

I was checking why my initial schema change takes several minutes using the 
managed schema API.
VisualVM shows that most of the time is spent in 
ManagedIndexSchema.postReadInform


Looking at the code shows that postReadInform is executed for every 
modification, and performs an inform on all fields.
In the end, inform is called ChangesToSchema * Fields times.

I prepared a PR that changes the flow to only postReadInform once after the 
changes are done.
improve speed of large schema changes for ManagedIndexSchema 
(tizianodeg/lucene-solr@54d2161):
https://github.com/tizianodeg/lucene-solr/commit/54d2161c8192c7f08e705d33f191b5cd9a087cd5

This change dramatically decreases a large managed schema update from several 
minutes to about one second.
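The flow change in the PR can be sketched abstractly. Below is a hypothetical Python model (not Solr's actual Java code) contrasting the old flow, where the per-field hook runs after every schema command, with the batched flow, where it runs once at the end:

```python
# Hypothetical model of the fix, not Solr's real code: postReadInform
# re-informs every field, so model it as O(number of fields) work.

class Schema:
    def __init__(self, fields):
        self.fields = set(fields)
        self.inform_calls = 0  # total per-field inform() invocations

    def post_read_inform(self):
        self.inform_calls += len(self.fields)

def apply_changes_naive(schema, new_fields):
    # Old flow: the hook runs after every single modification,
    # so inform work grows as ChangesToSchema * Fields.
    for f in new_fields:
        schema.fields.add(f)
        schema.post_read_inform()

def apply_changes_batched(schema, new_fields):
    # Proposed flow: apply all changes first, then inform once.
    for f in new_fields:
        schema.fields.add(f)
    schema.post_read_inform()

naive = Schema(f"f{i}" for i in range(100))
apply_changes_naive(naive, [f"new{i}" for i in range(50)])

batched = Schema(f"f{i}" for i in range(100))
apply_changes_batched(batched, [f"new{i}" for i in range(50)])

print(naive.inform_calls, batched.inform_calls)  # 6275 vs. 150
```

With 100 existing fields and 50 additions, the naive flow does 6275 units of inform work versus 150 for the batched flow, which matches the minutes-to-seconds improvement reported above.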

I’m not sure if setLatestSchema is the right place for the final call to 
postReadInform, and I'm also unsure whether making postReadInform public is 
acceptable.
How can I propose such an improvement? Or should I open a bug report for 
this?

Kind Regards,
Tiziano






Restore of Solr Collection with schema changes?

2018-05-23 Thread Satyanarayana Kalahasthi
Hi Team,

Is restoring a backup of a Solr collection possible if there were schema 
changes (adding a new field or deleting an existing field in managed-schema.xml) 
between the backup and the restore? I am not a registered user, but please advise.

Thanks
Kalahasthi Satyanarayana
Mobile : 08884581161




Re: Schema Changes

2016-07-28 Thread Anshum Gupta
Hi Ethan,

If the new fields are something that the old documents are also supposed to
contain, you would need to reindex. e.g. in case you add a new copy field
or a new field in general that your raw document contains, you would need
to reindex.
If the new field would only be something that exists in future documents,
you wouldn't need to reindex.

-Anshum

On Thu, Jul 28, 2016 at 12:50 PM, Ethan  wrote:

> Hi,
>
> We change our schema to add new fields 3-4 times a year.  Never modify
> existing fields.
>
> Some of my colleagues say it requires re-indexing. Does it?  None of the
> existing fields has changed.  schema.xml is the only file that is modified.
> So what's the point in re-indexing?
>
> Appreciate any insight.
>
> Thanks
>



-- 
Anshum Gupta


Schema Changes

2016-07-28 Thread Ethan
Hi,

We change our schema to add new fields 3-4 times a year.  Never modify
existing fields.

Some of my colleagues say it requires re-indexing. Does it?  None of the
existing fields has changed.  schema.xml is the only file that is modified.
So what's the point in re-indexing?

Appreciate any insight.

Thanks


Would it be better to make my Schema changes within the renamed "/solr-5.3.0/server/solr/configsets/data_driven_schema_configs/conf/schema.xml" instead of the way that I am doing it now via curl -X PO

2016-03-19 Thread John Mitchell
I noticed that within
"/solr-5.3.0/server/solr/configsets/data_driven_schema_configs/conf" it has
a file called "managed-schema" and within this file it says "This is the
Solr schema file. This file should be named "schema.xml" and should be in
the conf directory".  Currently I have not renamed this file to
"schema.xml" and any adjustments to the Schema I have done via "curl -X
POST -H 'Content-type:application/json' --data-binary '{
"add-field":       ".

My question: would it be better to make my Schema changes within the renamed
"/solr-5.3.0/server/solr/configsets/data_driven_schema_configs/conf/schema.xml"
instead of the way that I am doing it now via curl -X POST -H
'Content-type:application/json' --data-binary '{
"add-field":   "?

I have pasted below my shell script which starts with an empty Solr, then
adds to the Schema via "curl -X POST -H 'Content-type:application/json'
--data-binary '{ "add-field":   ", and then
runs Norconex-Collector-Http webcrawler which commits into Solr.

Thanks,

John Mitchell



#!/bin/bash

cd /home/jmitchell/20150905/solr-5.3.0

bin/solr stop -all ; rm -Rf example/cloud/

bin/solr start -e cloud -noprompt

# I am using a dynamic schema so I added the "content" field with the
type of "text_general" (see below) before starting to load any data into
Apache Solr and now everything correctly loads into Apache Solr. For now
not using the default "_text_" field as a replacement for the "content"
field.

curl -X POST -H 'Content-type:application/json' --data-binary '{
"add-field":{ "name":"content", "type":"text_general" } }'
http://localhost:8983/solr/gettingstarted/schema

# Adding the "Institutions_of_Higher_Education" field with a type of
strings:

curl -X POST -H 'Content-type:application/json' --data-binary '{
"add-field":{ "name":"Institutions_of_Higher_Education", "type":"strings" }
}' http://localhost:8983/solr/gettingstarted/schema

curl -X POST -H 'Content-type:application/json' --data-binary '{
"add-field":{ "name":"Local_Education_Agencies", "type":"strings" } }'
http://localhost:8983/solr/gettingstarted/schema

curl -X POST -H 'Content-type:application/json' --data-binary '{
"add-field":{ "name":"Nonprofit_Organizations", "type":"strings" } }'
http://localhost:8983/solr/gettingstarted/schema

curl -X POST -H 'Content-type:application/json' --data-binary '{
"add-field":{ "name":"Other_Organizations_and_or_Agencies",
"type":"strings" } }' http://localhost:8983/solr/gettingstarted/schema

curl -X POST -H 'Content-type:application/json' --data-binary '{
"add-field":{ "name":"State_Education_Agencies", "type":"strings" } }'
http://localhost:8983/solr/gettingstarted/schema

cd ..

chmod -R 777 solr-5.3.0

cd norconex-collector-http-2.2.1

rm -rf committer-queue/ examples-output/

#curl "http://localhost:8983/solr/gettingstarted/select?q=*:*&wt=json&indent=true&rows=10"

/home/jmitchell/20150905/norconex-collector-http-2.2.1/collector-http.sh -a
start -c examples/minimum/minimum-config.xml 2>&1 >
/home/jmitchell/collector-http-solr-gettingstarted-changed_content_structure_adding_new_faceted_fields_3.txt
&
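As an aside, the repeated single-field POSTs in the script above could probably be collapsed into one request: if I remember the Schema API correctly, "add-field" accepts a list of field definitions (worth verifying against your Solr version). A sketch that only builds and prints the batched JSON body, using the field names from the script, with no running Solr required:

```python
import json

# Hypothetical batched Schema API payload: one POST instead of six.
# Assumes "add-field" may carry a list of field definitions, which
# later Solr versions support -- verify against your version's docs.

facet_fields = [
    "Institutions_of_Higher_Education",
    "Local_Education_Agencies",
    "Nonprofit_Organizations",
    "Other_Organizations_and_or_Agencies",
    "State_Education_Agencies",
]

payload = {
    "add-field": [{"name": "content", "type": "text_general"}]
    + [{"name": name, "type": "strings"} for name in facet_fields]
}

body = json.dumps(payload)
print(body)
# The body would be POSTed once, e.g.:
#   curl -X POST -H 'Content-type:application/json' --data-binary "$body" \
#        http://localhost:8983/solr/gettingstarted/schema
```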




Re: Would it be better to make my Schema changes within the renamed "/solr-5.3.0/server/solr/configsets/data_driven_schema_configs/conf/schema.xml" instead of the way that I am doing it now via curl -

2016-03-19 Thread Shawn Heisey
On 3/18/2016 7:31 AM, John Mitchell wrote:
> My question would it be better to make my Schema changes within the renamed
> "/solr-5.3.0/server/solr/configsets/data_driven_schema_configs/conf/schema.xml"
> instead of the way that I am doing it now via curl -X POST -H
> 'Content-type:application/json' --data-binary '{
> "add-field":   "?

Only you can answer this question.  Do you want to continue using the
http API, or do you want to hand-edit the schema and reload the
core/collection?  Both are possible, but you should pick one or the other.

Note that you would not change the file in data_driven_schema_configs --
this is not the active configuration for a running index, it's a config
source for newly created indexes.  If you're not in cloud mode, you
would change the config in your core's conf directory.  If you're
running SolrCloud (Solr with zookeeper), the changes would need to be
made in zookeeper, most likely with the upconfig command on the zkcli
script.  After the change, the core/collection needs a reload.

Thanks,
Shawn



Re: What exactly happens to extant documents when the schema changes?

2013-05-30 Thread Dotan Cohen
On Wed, May 29, 2013 at 5:09 PM, Shawn Heisey s...@elyograg.org wrote:
 I handle this in a very specific way with my sharded index.  This won't
 work for all designs, and the precise procedure won't work for SolrCloud.

 There is a 'live' and a 'build' core for each of my shards.  When I want
 to reindex, the program makes a note of my current position for deletes,
 reinserts, and new documents.  Then I use a DIH full-import from mysql
 into the build cores.  Once the import is done, I run the update cycle
 of deletes, reinserts, and new documents on those build cores, using the
 position information noted earlier.  Then I swap the cores so the new
 index is online.


I do need to examine sharding and multiple cores. I'll look into that,
thank you. By the way, don't google for DIH! It took me some time to
figure out that it is DataImportHandler, as some people use the
acronym for something completely different.


 To adapt this for SolrCloud, I would need to use two collections, and
 update a collection alias for what is considered live.

 To control the I/O and CPU usage, you might need some kind of throttling
 in your update/rebuild application.

 I don't need any throttling in my design.  Because I'm using DIH, the
 import only uses a single thread for each shard on the server.  I've got
 RAID10 for storage and half of the CPU cores are still available for
 queries, so it doesn't overwhelm the server.

 The rebuild does lower performance, so I have the other copy of the
 index handle queries while the rebuild is underway.  When the rebuild is
 done on one copy, I run it again on the other copy.  Right now I'm
 half-upgraded -- one copy of my index is version 3.5.0, the other is
 4.2.1.  Switching to SolrCloud with sharding and replication would
 eliminate this flexibility, unless I maintained two separate clouds.


Thank you. I am not using Solr Cloud but if I ever consider it, then I
will keep this in mind.

--
Dotan Cohen

http://gibberish.co.il
http://what-is-what.com


Re: What exactly happens to extant documents when the schema changes?

2013-05-29 Thread Dotan Cohen
On Tue, May 28, 2013 at 2:20 PM, Upayavira u...@odoko.co.uk wrote:
 The schema provides Solr with a description of what it will find in the
 Lucene indexes. If you, for example, changed a string field to an
 integer in your schema, that'd mess things up bigtime. I recently had to
 upgrade a date field from the 1.4.1 date field format to the newer
 TrieDateField. Given I had to do it on a live index, I had to add a new
 field (just using copyfield) and re-index over the top, as the old field
 was still in use. I guess, given my app now uses the new date field
 only, I could presumably reindex the old date field with the new
 TrieDateField format, but I'd want to try that before I do it for real.


Thank you for the insight. Unfortunately, with 20 million records and
growing by hundreds each minute (social media posts) I don't see that
I could ever reindex the data in a timely way.


 However, if you changed a single valued field to a multi-valued one,
 that's not an issue, as a field with a single value is still valid for a
 multi-valued field.

 Also, if you add a new field, existing documents will be considered to
 have no value in that field. If that is acceptable, then you're fine.

 I guess if you remove a field, then those fields will be ignored by
 Solr, and thus not impact anything. But I have to say, I've never tried
 that.

 Thus - changing the schema will only impact on future indexing. Whether
 your existing index will still be valid depends upon the changes you are
 making.

 Upayavira

Thanks.

--
Dotan Cohen

http://gibberish.co.il
http://what-is-what.com


Re: What exactly happens to extant documents when the schema changes?

2013-05-29 Thread Dotan Cohen
On Tue, May 28, 2013 at 3:58 PM, Jack Krupansky j...@basetechnology.com wrote:
 The technical answer: Undefined and not guaranteed.


I was afraid of that!

 Sure, you can experiment and see what the effects happen to be in any
 given release, and maybe they don't tend to change (too much) between most
 releases, but there is no guarantee that any given change schema but keep
 existing data without a delete of directory contents and full reindex will
 actually be benign or what you expect.

 As a general proposition, when it comes to changing the schema and not
 deleting the directory and doing a full reindex, don't do it! Of course, we
 all know not to try to walk on thin ice, but a lot of people will try to do
 it anyway - and maybe it happens that most of the time the results are
 benign.


In the case of this particular application, reindexing really is
overly burdensome as the application is performing hundreds of writes
to the index per minute. How might I gauge how much spare I/O Solr
could commit to a reindex? All the data that I need is in fact in
stored fields.

Note that because the social media application that feeds our Solr
index is global, there are no 'off hours'.


 OTOH, you could file a Jira to propose that the effects of changing the
 schema but keeping the existing data should be precisely defined and
 documented, but, that could still change from release to release.


Seems like a lot of effort to document, for little benefit. I'm not
going to file it. I would like to know, though, is the schema
consulted at index time, query time, or both?


 From a practical perspective for your original question: If you suddenly add
 a field, there is no guarantee what will happen when you try to access that
 field for existing documents, or what will happen if you update existing
 documents. Sure, people can talk about what happens to be true today, but
 there is no guarantee for the future. Similarly for deleting a field from
 the schema, there is no guarantee about the status of existing data, even
 though people can chatter about what it seems to do today.

 Generally, you should design your application around contracts and what is
 guaranteed to be true, not what happens to be true from experiments or even
 experience. Granted, that is the theory and sometimes you do need to rely on
 experimentation and folklore and spotty or ambiguous documentation, but to
 the extent possible, it is best to avoid explicitly trying to rely on
 undocumented, uncontracted behavior.


Thanks. The application does change (added features) and we do not
want to lose old data.


 One question I asked long ago and never received an answer: what is the best
 practice for doing a full reindex - is it sufficient to first do a delete of
 *:*, or does the Solr index directory contents or even the directory
 itself need to be explicitly deleted first? I believe it is the latter, but
 the former seems to work, most of the time. Deleting the directory itself
 seems to be the best answer, to date - but no guarantees!


I don't have an answer for that, sorry!

--
Dotan Cohen

http://gibberish.co.il
http://what-is-what.com


Re: What exactly happens to extant documents when the schema changes?

2013-05-29 Thread Shawn Heisey
On 5/29/2013 1:07 AM, Dotan Cohen wrote:
 In the case of this particular application, reindexing really is
 overly burdensome as the application is performing hundreds of writes
 to the index per minute. How might I gauge how much spare I/O Solr
 could commit to a reindex? All the data that I need is in fact in
 stored fields.
 
 Note that because the social media application that feeds our Solr
 index is global, there are no 'off hours'.

I handle this in a very specific way with my sharded index.  This won't
work for all designs, and the precise procedure won't work for SolrCloud.

There is a 'live' and a 'build' core for each of my shards.  When I want
to reindex, the program makes a note of my current position for deletes,
reinserts, and new documents.  Then I use a DIH full-import from mysql
into the build cores.  Once the import is done, I run the update cycle
of deletes, reinserts, and new documents on those build cores, using the
position information noted earlier.  Then I swap the cores so the new
index is online.

To adapt this for SolrCloud, I would need to use two collections, and
update a collection alias for what is considered live.

To control the I/O and CPU usage, you might need some kind of throttling
in your update/rebuild application.

I don't need any throttling in my design.  Because I'm using DIH, the
import only uses a single thread for each shard on the server.  I've got
RAID10 for storage and half of the CPU cores are still available for
queries, so it doesn't overwhelm the server.

The rebuild does lower performance, so I have the other copy of the
index handle queries while the rebuild is underway.  When the rebuild is
done on one copy, I run it again on the other copy.  Right now I'm
half-upgraded -- one copy of my index is version 3.5.0, the other is
4.2.1.  Switching to SolrCloud with sharding and replication would
eliminate this flexibility, unless I maintained two separate clouds.

Thanks,
Shawn



What exactly happens to extant documents when the schema changes?

2013-05-28 Thread Dotan Cohen
When adding or removing a text field to/from the schema and then
restarting Solr, what exactly happens to extant documents? Is the
schema only consulted when Solr writes a document, therefore extant
documents are unaffected?

Considering that Solr supports dynamic fields, my experimentation with
removing and adding fields to the schema has shown almost no change in
the extant index results returned.

--
Dotan Cohen

http://gibberish.co.il
http://what-is-what.com


Re: What exactly happens to extant documents when the schema changes?

2013-05-28 Thread Upayavira


On Tue, May 28, 2013, at 10:21 AM, Dotan Cohen wrote:
 When adding or removing a text field to/from the schema and then
 restarting Solr, what exactly happens to extant documents? Is the
 schema only consulted when Solr writes a document, therefore extant
 documents are unaffected?
 
 Considering that Solr supports dynamic fields, my experimentation with
 removing and adding fields to the schema has shown almost no change in
 the extant index results returned.

The schema provides Solr with a description of what it will find in the
Lucene indexes. If you, for example, changed a string field to an
integer in your schema, that'd mess things up bigtime. I recently had to
upgrade a date field from the 1.4.1 date field format to the newer
TrieDateField. Given I had to do it on a live index, I had to add a new
field (just using copyfield) and re-index over the top, as the old field
was still in use. I guess, given my app now uses the new date field
only, I could presumably reindex the old date field with the new
TrieDateField format, but I'd want to try that before I do it for real.

However, if you changed a single valued field to a multi-valued one,
that's not an issue, as a field with a single value is still valid for a
multi-valued field.

Also, if you add a new field, existing documents will be considered to
have no value in that field. If that is acceptable, then you're fine.

I guess if you remove a field, then those fields will be ignored by
Solr, and thus not impact anything. But I have to say, I've never tried
that.

Thus - changing the schema will only impact on future indexing. Whether
your existing index will still be valid depends upon the changes you are
making.

Upayavira


Re: What exactly happens to extant documents when the schema changes?

2013-05-28 Thread Jack Krupansky

The technical answer: Undefined and not guaranteed.

Sure, you can experiment and see what the effects happen to be in any 
given release, and maybe they don't tend to change (too much) between most 
releases, but there is no guarantee that any given change schema but keep 
existing data without a delete of directory contents and full reindex will 
actually be benign or what you expect.


As a general proposition, when it comes to changing the schema and not 
deleting the directory and doing a full reindex, don't do it! Of course, we 
all know not to try to walk on thin ice, but a lot of people will try to do 
it anyway - and maybe it happens that most of the time the results are 
benign.


OTOH, you could file a Jira to propose that the effects of changing the 
schema but keeping the existing data should be precisely defined and 
documented, but, that could still change from release to release.


From a practical perspective for your original question: If you suddenly add 
a field, there is no guarantee what will happen when you try to access that 
field for existing documents, or what will happen if you update existing 
documents. Sure, people can talk about what happens to be true today, but 
there is no guarantee for the future. Similarly for deleting a field from 
the schema, there is no guarantee about the status of existing data, even 
though people can chatter about what it seems to do today.


Generally, you should design your application around contracts and what is 
guaranteed to be true, not what happens to be true from experiments or even 
experience. Granted, that is the theory and sometimes you do need to rely on 
experimentation and folklore and spotty or ambiguous documentation, but to 
the extent possible, it is best to avoid explicitly trying to rely on 
undocumented, uncontracted behavior.


One question I asked long ago and never received an answer: what is the best 
practice for doing a full reindex - is it sufficient to first do a delete of 
*:*, or does the Solr index directory contents or even the directory 
itself need to be explicitly deleted first? I believe it is the latter, but 
the former seems to work, most of the time. Deleting the directory itself 
seems to be the best answer, to date - but no guarantees!



-- Jack Krupansky

-Original Message- 
From: Dotan Cohen

Sent: Tuesday, May 28, 2013 5:21 AM
To: solr-user@lucene.apache.org
Subject: What exactly happens to extant documents when the schema changes?

When adding or removing a text field to/from the schema and then
restarting Solr, what exactly happens to extant documents? Is the
schema only consulted when Solr writes a document, therefore extant
documents are unaffected?

Considering that Solr supports dynamic fields, my experimentation with
removing and adding fields to the schema has shown almost no change in
the extant index results returned.

--
Dotan Cohen

http://gibberish.co.il
http://what-is-what.com 



Re: schema changes changes 3.3 to 3.4?

2011-10-05 Thread jo
Okay, I did use the analysis tool and it did make me notice a few things, but
more importantly, here is what changed:

there is no longer a field type named text in the new schema, there is
only text_en, which is weird, as the text field is the default when doing a
query..
anyway, when I used the analysis tool and made the stemmers and everything
match between the old schema and the new schema, I get a result in the
analysis tool but not in the query.

I have to say that I have been using Solr with the default schema without
any changes in the past, but since I upgraded to 3.4.0 I have this problem
with the plurals not being matched.


--
View this message in context: 
http://lucene.472066.n3.nabble.com/schema-changes-changes-3-3-to-3-4-tp3391019p3396737.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: schema changes changes 3.3 to 3.4?

2011-10-04 Thread Erick Erickson
It looks to me like you changed the analysis
chain for the field in question by removing
stemmers of some sort or other. The quickest
way to answer this kind of question is to get
familiar with the admin/analysis page (don't
forget to check the verbose checkboxes). Enter
the term in both the index and query boxes and
hit the button. It shows you exactly what parts
of the chain performed what actions.

So your index analysis chain probably removed
the plurals, but your query-side chain didn't. So I'm
guessing that not only did it not show the
metadata, it didn't even find the document in
question.

But that's a guess at this point.

Best
Erick

On Mon, Oct 3, 2011 at 4:22 PM, jo jairo.or...@firmex.com wrote:
 Hi, I have the following issue on my test environment
 when I do a query with the full word the reply no longer contains the
 attr_meta
 ex:
 http://solr1:8983/solr/core_1/select/?q=stegosaurus

 <arr name="attr_content_encoding">
   <str>ISO-8859-1</str>
 </arr>
 <arr name="attr_content_language">
   <str>en</str>
 </arr>

 but if I remove just one letter it shows the expected response
 ex:
 http://solr1:8983/solr/core_1/select/?q=stegosauru

 <arr name="attr_content_encoding">
   <str>ISO-8859-1</str>
 </arr>
 <arr name="attr_meta">
   <str>stream_source_info</str>
   <str>document</str>
   <str>stream_content_type</str>
   <str>text/plain</str>
   <str>stream_size</str>
   <str>81</str>
   <str>Content-Encoding</str>
   <str>ISO-8859-1</str>
   <str>stream_name</str>
   <str>filex123.txt</str>
   <str>Content-Type</str>
   <str>text/plain</str>
   <str>resourceName</str>
   <str>dinosaurs5.txt</str>
 </arr>


 For troubleshooting I replaced the schema.xml from 3.3 into 3.4 and it works
 just fine. I can't find what changes in the schema would cause this, any
 clues?

 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/schema-changes-changes-3-3-to-3-4-tp3391019p3391019.html



Re: schema changes changes 3.3 to 3.4?

2011-10-04 Thread jo
Interesting... I did not make changes to the default settings, but definitely
will give that a shot.. thanks. I will comment later if I find a
solution besides replacing the schema with the default one from 3.3

thanks

JO

--
View this message in context: 
http://lucene.472066.n3.nabble.com/schema-changes-changes-3-3-to-3-4-tp3391019p3395265.html


schema changes changes 3.3 to 3.4?

2011-10-03 Thread jo
Hi, I have the following issue on my test environment
when I do a query with the full word the reply no longer contains the
attr_meta
ex: 
http://solr1:8983/solr/core_1/select/?q=stegosaurus

<arr name="attr_content_encoding">
  <str>ISO-8859-1</str>
</arr>
<arr name="attr_content_language">
  <str>en</str>
</arr>

but if I remove just one letter it shows the expected response
ex:
http://solr1:8983/solr/core_1/select/?q=stegosauru

<arr name="attr_content_encoding">
  <str>ISO-8859-1</str>
</arr>
<arr name="attr_meta">
  <str>stream_source_info</str>
  <str>document</str>
  <str>stream_content_type</str>
  <str>text/plain</str>
  <str>stream_size</str>
  <str>81</str>
  <str>Content-Encoding</str>
  <str>ISO-8859-1</str>
  <str>stream_name</str>
  <str>filex123.txt</str>
  <str>Content-Type</str>
  <str>text/plain</str>
  <str>resourceName</str>
  <str>dinosaurs5.txt</str>
</arr>


For troubleshooting I replaced the schema.xml from 3.3 into 3.4 and it works
just fine. I can't find what changes in the schema would cause this, any
clues?

--
View this message in context: 
http://lucene.472066.n3.nabble.com/schema-changes-changes-3-3-to-3-4-tp3391019p3391019.html


Re: Server Restart Required for Schema Changes After Document Delete All?

2011-06-27 Thread Tomás Fernández Löbbe
This should work with dynamic fields too. Are you having any problems with
it?


On Thu, Jun 23, 2011 at 3:14 PM, Brandon Fish brandon.j.f...@gmail.comwrote:

 Are there any schema changes that would cause problems with the following
 procedure from the
 FAQ
 http://wiki.apache.org/solr/FAQ#How_can_I_rebuild_my_index_from_scratch_if_I_change_my_schema.3F
 
 ?

 1. Use the match all docs query in a delete by query command before
 shutting down Solr: <delete><query>*:*</query></delete>
 2. Reload core
 3. Re-Index your data

 Would this work when dynamic fields are removed?



Re: Server Restart Required for Schema Changes After Document Delete All?

2011-06-27 Thread Tomás Fernández Löbbe
One more thing, it's not necessary to restart the server, just to reload the
core: http://wiki.apache.org/solr/CoreAdmin#RELOAD

2011/6/27 Tomás Fernández Löbbe tomasflo...@gmail.com

 This should work with dynamic fields too. Are you having any problems with
 it?


 On Thu, Jun 23, 2011 at 3:14 PM, Brandon Fish brandon.j.f...@gmail.comwrote:

 Are there any schema changes that would cause problems with the following
 procedure from the
 FAQ
 http://wiki.apache.org/solr/FAQ#How_can_I_rebuild_my_index_from_scratch_if_I_change_my_schema.3F
 
 ?

 1. Use the match all docs query in a delete by query command before
 shutting down Solr: <delete><query>*:*</query></delete>
 2. Reload core
 3. Re-Index your data

 Would this work when dynamic fields are removed?





Re: Server Restart Required for Schema Changes After Document Delete All?

2011-06-27 Thread Brandon Fish
I'm not having any issues. I was just asking to see if any backward
incompatible changes exist that would require a server restart. Thanks.

2011/6/27 Tomás Fernández Löbbe tomasflo...@gmail.com

 This should work with dynamic fields too. Are you having any problems with
 it?


 On Thu, Jun 23, 2011 at 3:14 PM, Brandon Fish brandon.j.f...@gmail.com
 wrote:

  Are there any schema changes that would cause problems with the following
  procedure from the
  FAQ
 
 http://wiki.apache.org/solr/FAQ#How_can_I_rebuild_my_index_from_scratch_if_I_change_my_schema.3F
  
  ?
 
  1. Use the match all docs query in a delete by query command before
  shutting down Solr: <delete><query>*:*</query></delete>
  2. Reload core
  3. Re-Index your data
 
  Would this work when dynamic fields are removed?
 



Server Restart Required for Schema Changes After Document Delete All?

2011-06-23 Thread Brandon Fish
Are there any schema changes that would cause problems with the following
procedure from the
FAQhttp://wiki.apache.org/solr/FAQ#How_can_I_rebuild_my_index_from_scratch_if_I_change_my_schema.3F
?

1. Use the match all docs query in a delete by query command before
shutting down Solr: <delete><query>*:*</query></delete>
2. Reload core
3. Re-Index your data

Would this work when dynamic fields are removed?
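For reference, the delete-by-query body the FAQ refers to (the archive stripped its angle brackets) reconstructs to standard Solr update XML; a minimal sanity check that it is well-formed:

```python
import xml.etree.ElementTree as ET

# The FAQ's step-1 command body, restored to proper XML.
delete_cmd = "<delete><query>*:*</query></delete>"

root = ET.fromstring(delete_cmd)  # parses cleanly if well-formed
print(root.tag, root.find("query").text)  # delete *:*
```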


Re: Which schema changes are incompatible?

2010-02-10 Thread Chris Hostetter

: 
http://wiki.apache.org/solr/FAQ#How_can_I_rebuild_my_index_from_scratch_if_I_change_my_schema.3F
: 
: but it is not clear about the times when this is needed. So I wonder, do I
: need to do it after adding a field, removing a field, changing field type,
: changing indexed/stored/multiValue properties? What happens if I don't do
: it, will Solr die?

there is no simple answer to that question ... if you add a field you 
don't need to rebuild (unless you want to ensure every doc gets a value 
indexed, or you are depending on Solr to apply a default value).  If you 
remove a field you don't need to rebuild (but none of the space taken up 
by that field in the index will be reclaimed, and if it's stored it will 
still be included in the response).

Changing a field type is one of the few situations where we can 
categorically say you *HAVE* to reindex everything
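The rules of thumb above can be condensed into a small decision helper; this is an informal summary of the thread, not an official Solr contract:

```python
# Informal summary of this thread's guidance -- not a Solr guarantee.
# Maps a schema change to whether a full reindex is required.
REINDEX_REQUIRED = {
    "add field": False,          # unless old docs need a value or default
    "remove field": False,       # space not reclaimed; stored values linger
    "change field type": True,   # the one categorical "must reindex" case
}

def must_reindex(change):
    return REINDEX_REQUIRED[change]

print(must_reindex("change field type"))  # True
```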

: Also, the FAQ entry notes that one can delete all documents, change the
: schema.xml file, and then reload the core. Would it be possible to instead
: change schema.xml, reload the core, and then rebuild the index -- in effect
: slowly deleting the old documents, but never ending up with a completely
: empty index? I realize that some weird search results could happen during
: such a rebuild, but that may be preferable to having no search results at

The end result won't be 100% equivalent from an index standpoint -- when 
you delete all, Solr is actually able to completely start over with an 
empty index, absent all low-level metadata about fields that used to exist 
-- if you incrementally delete, some of that low-level metadata will still 
be in the index -- it probably won't be something that will ever affect 
you, but it is a distinction.



-Hoss



Which schema changes are incompatible?

2010-01-28 Thread Anders Melchiorsen
Hello.

I read the FAQ entry about rebuilding the index,

 
http://wiki.apache.org/solr/FAQ#How_can_I_rebuild_my_index_from_scratch_if_I_change_my_schema.3F

but it is not clear about the times when this is needed. So I wonder, do I
need to do it after adding a field, removing a field, changing field type,
changing indexed/stored/multiValue properties? What happens if I don't do
it, will Solr die?

Also, the FAQ entry notes that one can delete all documents, change the
schema.xml file, and then reload the core. Would it be possible to instead
change schema.xml, reload the core, and then rebuild the index -- in effect
slowly deleting the old documents, but never ending up with a completely
empty index? I realize that some weird search results could happen during
such a rebuild, but that may be preferable to having no search results at
all.

(I also realize that we need more Solr servers, to be able to do these
updates without taking down the search service. But, currently we have just
one)


Thanks,
Anders.