OK so the way I understand this is that if there is a synonym on a specific field at index time, that value will be stored rather than the one in the csv that I am indexing? I will give it a whirl and report back...
Thanks!
Adam

On Sat, Dec 4, 2010 at 2:27 PM, Erick Erickson <erickerick...@gmail.com> wrote:

> When you define your fieldType at index time. My idea
> was that you substitute these on the way in to your
> index. You may need a specific field type just for your
> country conversion.... Perhaps in a copyField if
> you need both the code and full name....
>
> Best
> Erick
>
> On Sat, Dec 4, 2010 at 12:16 PM, Adam Estrada <estrada.adam.gro...@gmail.com> wrote:
>
> > Synonyms eh? I have a synonym list like the following, so how do I
> > identify the synonyms on a specific field? The only place the field
> > is used is as a facet.
> >
> > original field => country name
> >
> > AF => AFGHANISTAN
> > AX => ÅLAND ISLANDS
> > AL => ALBANIA
> > DZ => ALGERIA
> > AS => AMERICAN SAMOA
> > AD => ANDORRA
> > AO => ANGOLA
> > AI => ANGUILLA
> > AQ => ANTARCTICA
> > AG => ANTIGUA AND BARBUDA
> > AR => ARGENTINA
> > AM => ARMENIA
> > AW => ARUBA
> > AU => AUSTRALIA
> > AT => AUSTRIA
> > etc...
> >
> > Any advice on that would be great and very much appreciated!
> >
> > Adam
> >
> > On Fri, Dec 3, 2010 at 3:55 PM, Erick Erickson <erickerick...@gmail.com> wrote:
> >
> > > That will certainly work. Another option, assuming the country codes
> > > are in their own field, would be to put the transformations into a
> > > synonym file that was only used on that field. That way you'd get
> > > this without having to do the pre-process step of the raw data...
> > >
> > > That said, if your pre-processing is working for you, it may not be
> > > worth your while to worry about doing it differently.
> > >
> > > Best
> > > Erick
> > >
> > > On Fri, Dec 3, 2010 at 12:51 PM, Adam Estrada <estrada.adam.gro...@gmail.com> wrote:
> > >
> > > > First off...I know enough about Solr to be VERY dangerous so please
> > > > bear with me ;-) I am indexing the geonames database, which only
> > > > provides country codes. I can facet the codes, but to the end user
> > > > who may not know all 249 codes, it isn't really all that helpful.
> > > > Therefore, I want to map the full country names to the country
> > > > codes provided in the geonames db.
> > > > http://download.geonames.org/export/dump/
> > > >
> > > > I used a simple split function to chop the 850 meg txt file into
> > > > manageable csv's that I can import into Solr. Now that all
> > > > 7 million + documents are in there, I want to change the country
> > > > codes to the actual country names. I would have liked to have done
> > > > it in the index, but finding and replacing the strings in the csv
> > > > seems to be working fine. After that I can just reindex the entire
> > > > thing.
> > > >
> > > > Adam
> > > >
> > > > On Fri, Dec 3, 2010 at 12:42 PM, Erick Erickson <erickerick...@gmail.com> wrote:
> > > >
> > > > > Have you considered defining synonyms for your code <-> country
> > > > > conversion at index time (or query time for that matter)?
> > > > >
> > > > > We may have an XY problem here. Could you state the high-level
> > > > > problem you're trying to solve? Maybe there's a better solution...
> > > > >
> > > > > Best
> > > > > Erick
> > > > >
> > > > > On Fri, Dec 3, 2010 at 12:20 PM, Adam Estrada <estrada.adam.gro...@gmail.com> wrote:
> > > > >
> > > > > > I wonder...I know that sed would work to find and replace the
> > > > > > terms in all of the csv files that I am indexing, but would it
> > > > > > work to find and replace key terms in the index?
> > > > > >
> > > > > > find C:\\tmp\\index\\data -type f -exec sed -i 's/AF/AFGHANISTAN/g' {} \;
> > > > > >
> > > > > > That command would iterate through all the files in the data
> > > > > > directory and replace the country code with the full country
> > > > > > name. I may just back up the directory and try it. I have it
> > > > > > running on csv files right now and it's working wonderfully.
> > > > > > For those of you interested, I am indexing the entire Geonames
> > > > > > dataset http://download.geonames.org/export/dump/ (allCountries.zip)
> > > > > > which gives me a pretty comprehensive world gazetteer. My next
> > > > > > step is gonna be to display the results as KML to view over a
> > > > > > google globe.
> > > > > >
> > > > > > Thoughts?
> > > > > >
> > > > > > Adam
> > > > > >
> > > > > > On Fri, Dec 3, 2010 at 7:57 AM, Erick Erickson <erickerick...@gmail.com> wrote:
> > > > > >
> > > > > > > No, there's no equivalent to SQL update for all values in a
> > > > > > > column. You'll have to reindex all the documents.
> > > > > > >
> > > > > > > On Thu, Dec 2, 2010 at 10:52 PM, Adam Estrada <estrada.adam.gro...@gmail.com> wrote:
> > > > > > >
> > > > > > > > OK part 2 of my previous question...
> > > > > > > >
> > > > > > > > Is there a way to batch update field values based on
> > > > > > > > certain criteria? For example, if thousands of documents
> > > > > > > > have a field value of 'US' can I update all of them to
> > > > > > > > 'United States' programmatically?
> > > > > > > >
> > > > > > > > Adam
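
For the CSV pre-processing route discussed in the thread, here is a minimal sketch in Python, assuming the country code sits in its own column of each file. The function name `expand_country_codes`, the column index, and the sample rows are made up for illustration, and the mapping is only a three-entry excerpt of the list quoted above. Unlike a global `s/AF/AFGHANISTAN/g` substitution, it touches only the one column, so an "AF" that happens to appear inside another value is left alone.

```python
import csv
import io

# Hypothetical three-entry excerpt of the code => name list from the thread.
COUNTRY_NAMES = {
    "AF": "AFGHANISTAN",
    "AL": "ALBANIA",
    "DZ": "ALGERIA",
}


def expand_country_codes(src, dst, column, names=COUNTRY_NAMES, delimiter=","):
    """Copy rows from src to dst, replacing the code in one column only."""
    reader = csv.reader(src, delimiter=delimiter)
    writer = csv.writer(dst, delimiter=delimiter, lineterminator="\n")
    for row in reader:
        if column < len(row):
            # Codes missing from the mapping are passed through unchanged.
            row[column] = names.get(row[column], row[column])
        writer.writerow(row)


if __name__ == "__main__":
    # Made-up sample rows; the "AF" inside "CAFE" must not be rewritten.
    sample = io.StringIO("1,KABUL CAFE,AF\n2,Tirana,AL\n3,Nowhere,ZZ\n")
    out = io.StringIO()
    expand_country_codes(sample, out, column=2)
    print(out.getvalue(), end="")
```

To run it over real files, open each input and output with `newline=""` and pass the column position and delimiter that match your split files; both are guesses here, since the thread does not say how the csv's are laid out.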