Re: Composite key for uniqueKeyId
On Fri, 7 Mar 2008 17:59:48 -0800 (PST) Chris Hostetter wrote:

> I believe Norberto meant he was handling it in his update client code --
> before sending the docs to Solr.

Indeed, this is what we do. We have a process that parses certain files, generates documents following the Solr schema in use, and publishes them to the index. This process is the one that generates the DocID based on other fields.

> Something that *seems* possible but I've never actually tried is writing
> a ConcatTokenFilterFactory that queues up all the tokens and joins them
> together (using some configured string, defaulting to ) then you could in
> theory do something like this...

Yeah, I never tried this because we need somewhat more complex calculations to be done for DocId.

cheers,
B
_
{Beto|Norberto|Numard} Meijome

It's not what you do, it's the love you put into it.
   Mother Theresa.

I speak for myself, not my employer. Contents may be hot. Slippery when wet. Reading disclaimers makes you go blind. Writing them is worse. You have been Warned.
Re: Composite key for uniqueKeyId
The best thing folks can do to help with getting patches like this important DataImportHandler committed to trunk is to try it out, report back experiences, and offer suggestions for improvement.

Solr 1.3 will come in _good_ time, but not before its time. There are many substantial changes in Solr between 1.2 and trunk, and some more slated. Knocking out any of these gets us closer to the release as well:

http://issues.apache.org/jira/secure/IssueNavigator.jspa?sorter/ field=statussorter/order=DESC

Erik

On Mar 8, 2008, at 2:48 AM, Vijay Rao wrote:

> I am also looking forward to getting this checked into the trunk. Will
> there be a patch with Solr 1.2 support?
>
> Cheers
> Vijay
>
> On Sat, Mar 8, 2008 at 10:11 AM, Jon Baer wrote:
>
>> That definitely sounds like the proper way to go + will try. I'm not
>> too concerned w/ my keys coming back, just that I can't seem to run the
>> DataImportHandler w/o one. I was able to temporarily get around it by
>> returning it in the entity query, i.e.:
>>
>>   <entity query="select concat(col1,col2,col3,col4) as id">
>>     <field name="id" column="id" />
>>   </entity>
>>
>> BTW, the DataImportHandler seems to still be a patch. Is there an
>> estimate of if/when it will appear in trunk?
>>
>> Thanks!
>> - Jon
>>
>> On Mar 7, 2008, at 8:59 PM, Chris Hostetter wrote:
>>
>>> I believe Norberto meant he was handling it in his update client
>>> code -- before sending the docs to Solr.
>>>
>>> Something that *seems* possible but I've never actually tried is
>>> writing a ConcatTokenFilterFactory that queues up all the tokens and
>>> joins them together (using some configured string, defaulting to )
>>> then you could in theory do something like this...
>>>
>>>   <fieldType name="compositeKeyType" class="solr.TextField" omitNorms="true">
>>>     <analyzer>
>>>       <tokenizer class="solr.KeywordTokenizerFactory"/>
>>>       <filter class="solr.ConcatTokenFilterFactory" delim="-"/>
>>>     </analyzer>
>>>   </fieldType>
>>>   ...
>>>   <field name="compositeKey" type="compositeKeyType" />
>>>   <uniqueKey>compositeKey</uniqueKey>
>>>   ...
>>>   <copyField source="type" dest="compositeKey"/>
>>>   <copyField source="numId" dest="compositeKey"/>
>>>   ...
>>>
>>> that *might* work ... but things would be a little weird when viewing
>>> your results (compositeKey would have to be multivalued, and it would
>>> return as an array)
>>>
>>> -Hoss
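The ConcatTokenFilterFactory idea quoted above was never implemented in this thread, so the following is only a plain-Python simulation of the behaviour Hoss describes (queue every token, then emit a single joined token). It is not a Lucene TokenFilter; the function name `concat_tokens` is hypothetical, and the "-" default is borrowed from the `delim=-` example because the default string is elided in the original message.

```python
from typing import Iterable, Iterator


def concat_tokens(tokens: Iterable[str], delim: str = "-") -> Iterator[str]:
    """Queue every incoming token, then emit one joined token.

    Assumption: "-" is used as the default delimiter only because the
    schema example sets delim="-"; the thread does not state the default.
    """
    queued = list(tokens)      # buffer the whole stream before emitting
    if queued:                 # an empty stream produces no token at all
        yield delim.join(queued)


# With a keyword tokenizer, each value copied into the field would arrive
# as one token, so two copyField sources give a two-token stream.
values_copied_into_key = ["NEWS", "12345"]
print(list(concat_tokens(values_copied_into_key)))  # ['NEWS-12345']
```

Note that the joined value depends on the order in which the values arrive, so the two sources would always have to be copied in the same order for the key to be stable.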
Re: Composite key for uniqueKeyId
Hi,

The tool is undergoing substantial testing in our QA department. Because it is also an official internal project, the bugs are filed in our bug tool. We are fixing them as and when they are reported. It has gone through some good iterations, and it is going to power the backend for two of our products, which are going to come out in a month's time (more in the pipeline). Internally it has already had a 1.0 release. The next patch is going to contain the 1.0 release + a few extra features.

We are testing with a dataset of ~3 million documents. Each document is built by joining around 6 tables.

This is not to say that it is free of bugs. Please do the testing and report back any bugs, and we will be glad to incorporate the fixes in the next patch.

--Noble

On Sat, Mar 8, 2008 at 4:10 PM, Erik Hatcher wrote:

> The best thing folks can do to help with getting patches like this
> important DataImportHandler committed to trunk is to try it out, report
> back experiences, and offer suggestions for improvement.
>
> Solr 1.3 will come in _good_ time, but not before its time. There are
> many substantial changes in Solr between 1.2 and trunk, and some more
> slated. Knocking out any of these gets us closer to the release as well:
>
> http://issues.apache.org/jira/secure/IssueNavigator.jspa?sorter/ field=statussorter/order=DESC
>
> Erik

--
--Noble Paul
Re: Composite key for uniqueKeyId
I used it to index my DB and I found no bugs. Ours is a very simple use case. There were rough edges though. The error logging and messages were not up to the mark. It aborted the entire indexing run when a 'required' field was missing. It should just skip that document, or give me an option to configure that. I am waiting for the next patch to file the bugs.

It gives me some confidence to know that this tool is powering AOL's infrastructure.

Cheers
Vijay

On Sat, Mar 8, 2008 at 6:48 PM, Noble Paul നോബിള് नोब्ळ् wrote:

> Hi,
>
> The tool is undergoing substantial testing in our QA department.
> Because it is also an official internal project, the bugs are filed in
> our bug tool. We are fixing them as and when they are reported. It has
> gone through some good iterations, and it is going to power the backend
> for two of our products, which are going to come out in a month's time
> (more in the pipeline). Internally it has already had a 1.0 release.
> The next patch is going to contain the 1.0 release + a few extra
> features.
>
> We are testing with a dataset of ~3 million documents. Each document is
> built by joining around 6 tables.
>
> This is not to say that it is free of bugs. Please do the testing and
> report back any bugs, and we will be glad to incorporate the fixes in
> the next patch.
>
> --Noble
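The skip-or-abort behaviour Vijay asks for above can be illustrated with a small client-side check. This is only a sketch of the policy, not how DataImportHandler works or is configured; the function `filter_indexable` and its `on_missing` parameter are hypothetical names invented for the example.

```python
from typing import Iterable, List, Mapping, Sequence, Tuple

Doc = Mapping[str, object]


def filter_indexable(
    docs: Iterable[Doc],
    required: Sequence[str],
    on_missing: str = "skip",
) -> Tuple[List[Doc], List[Tuple[Doc, List[str]]]]:
    """Split docs into (accepted, skipped) according to required fields.

    on_missing="skip"  -> set the incomplete document aside and continue
    on_missing="abort" -> raise on the first incomplete document
    """
    if on_missing not in ("skip", "abort"):
        raise ValueError(f"unknown on_missing policy: {on_missing!r}")

    accepted: List[Doc] = []
    skipped: List[Tuple[Doc, List[str]]] = []
    for doc in docs:
        # A field counts as missing when it is absent or explicitly None.
        missing = [name for name in required if doc.get(name) is None]
        if not missing:
            accepted.append(doc)
        elif on_missing == "abort":
            raise ValueError(f"document is missing required fields: {missing}")
        else:
            skipped.append((doc, missing))  # keep the reason for logging
    return accepted, skipped


rows = [{"id": "NEWS12345", "title": "ok"}, {"title": "no id"}]
accepted, skipped = filter_indexable(rows, required=["id"])
print(len(accepted), len(skipped))  # 1 1
```

Returning the skipped documents together with the names of the missing fields also addresses the logging complaint: the caller can report exactly which rows were dropped and why.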
Re: Composite key for uniqueKeyId
Hi Norberto,

This sounds like exactly what I'm looking to do. Do you have an example? (Keep in mind I'm using data-config.xml - DataImporter.)

I'm interested in merging different types of content in, i.e.:

  NEWS12345
  VIDEO12345

So I'd like to end up w/ different keys per type if possible.

Thanks.
- Jon

On Mar 6, 2008, at 11:21 PM, Norberto Meijome wrote:

> On Thu, 6 Mar 2008 11:33:38 -0500 Jon Baer wrote:
>
>> I'm interested to know if composite keys are now possible, or if there
>> is anything in copyField I can use to get composite keys working for
>> my doc ids?
>
> FWIW, we just do this @ doc generation time - grab several fields,
> massage them into shape, normalise, assign to docID
>
> B
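For the NEWS12345 / VIDEO12345 case Jon describes above, a minimal client-side sketch could prefix each source row id with its content type before the document is sent to Solr. The helper name `typed_key` and the sample documents are hypothetical; nothing here is part of Solr or DataImportHandler.

```python
def typed_key(content_type: str, source_id: int) -> str:
    """Prefix a source row id with its content type, e.g. NEWS12345."""
    return f"{content_type.strip().upper()}{source_id}"


# Two rows from different tables that happen to share the same numeric id.
news_doc = {"id": typed_key("news", 12345), "title": "Some headline"}
video_doc = {"id": typed_key("video", 12345), "title": "Some clip"}

print(news_doc["id"], video_doc["id"])  # NEWS12345 VIDEO12345
```

Because the type is part of the key, the two rows no longer overwrite each other when both are indexed under the same uniqueKey field.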
Re: Composite key for uniqueKeyId
That definitely sounds like the proper way to go + will try. I'm not too concerned w/ my keys coming back, just that I can't seem to run the DataImportHandler w/o one. I was able to temporarily get around it by returning it in the entity query, i.e.:

  <entity query="select concat(col1,col2,col3,col4) as id">
    <field name="id" column="id" />
  </entity>

BTW, the DataImportHandler seems to still be a patch. Is there an estimate of if/when it will appear in trunk?

Thanks!
- Jon

On Mar 7, 2008, at 8:59 PM, Chris Hostetter wrote:

> I believe Norberto meant he was handling it in his update client code --
> before sending the docs to Solr.
>
> Something that *seems* possible but I've never actually tried is writing
> a ConcatTokenFilterFactory that queues up all the tokens and joins them
> together (using some configured string, defaulting to ) then you could in
> theory do something like this...
>
>   <fieldType name="compositeKeyType" class="solr.TextField" omitNorms="true">
>     <analyzer>
>       <tokenizer class="solr.KeywordTokenizerFactory"/>
>       <filter class="solr.ConcatTokenFilterFactory" delim="-"/>
>     </analyzer>
>   </fieldType>
>   ...
>   <field name="compositeKey" type="compositeKeyType" />
>   <uniqueKey>compositeKey</uniqueKey>
>   ...
>   <copyField source="type" dest="compositeKey"/>
>   <copyField source="numId" dest="compositeKey"/>
>   ...
>
> that *might* work ... but things would be a little weird when viewing
> your results (compositeKey would have to be multivalued, and it would
> return as an array)
>
> -Hoss
Re: Composite key for uniqueKeyId
Good to hear that people are using DataImportHandler.

In a couple of days we will post another patch, cleared by our QA, with better error handling and messaging and a lot of new features.

A committer will have to decide when it is good enough to be committed.

--Noble

On Sat, Mar 8, 2008 at 10:11 AM, Jon Baer wrote:

> That definitely sounds like the proper way to go + will try. I'm not too
> concerned w/ my keys coming back, just that I can't seem to run the
> DataImportHandler w/o one. I was able to temporarily get around it by
> returning it in the entity query, i.e.:
>
>   <entity query="select concat(col1,col2,col3,col4) as id">
>     <field name="id" column="id" />
>   </entity>
>
> BTW, the DataImportHandler seems to still be a patch. Is there an
> estimate of if/when it will appear in trunk?
>
> Thanks!
> - Jon

--
--Noble Paul
Re: Composite key for uniqueKeyId
I am also looking forward to getting this checked into the trunk. Will there be a patch with Solr 1.2 support?

Cheers
Vijay

On Sat, Mar 8, 2008 at 10:11 AM, Jon Baer wrote:

> That definitely sounds like the proper way to go + will try. I'm not too
> concerned w/ my keys coming back, just that I can't seem to run the
> DataImportHandler w/o one. I was able to temporarily get around it by
> returning it in the entity query, i.e.:
>
>   <entity query="select concat(col1,col2,col3,col4) as id">
>     <field name="id" column="id" />
>   </entity>
>
> BTW, the DataImportHandler seems to still be a patch. Is there an
> estimate of if/when it will appear in trunk?
>
> Thanks!
> - Jon
Re: Composite key for uniqueKeyId
On Thu, 6 Mar 2008 11:33:38 -0500 Jon Baer wrote:

> I'm interested to know if composite keys are now possible, or if there
> is anything in copyField I can use to get composite keys working for my
> doc ids?

FWIW, we just do this @ doc generation time - grab several fields, massage them into shape, normalise, assign to docID

B
_
{Beto|Norberto|Numard} Meijome

...using the internet as it was originally intended... for the further research of pornography and pipebombs.
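The "grab several fields, massage them into shape, normalise, assign to docID" step could look roughly like the sketch below. Norberto does not say which fields or normalisation rules his process uses, so the rules here (Unicode NFKC, lower-casing, collapsing anything non-alphanumeric to "_") and the names `normalise` and `make_doc_id` are assumptions made purely for illustration.

```python
import re
import unicodedata
from typing import Mapping, Sequence


def normalise(value: object) -> str:
    """Lower-case, trim, and collapse non-alphanumeric runs to "_"."""
    text = unicodedata.normalize("NFKC", str(value)).strip().lower()
    return re.sub(r"[^a-z0-9]+", "_", text).strip("_")


def make_doc_id(record: Mapping[str, object], key_fields: Sequence[str],
                delim: str = "-") -> str:
    """Build a document id from the chosen fields, in the order given."""
    parts = [normalise(record[name]) for name in key_fields]
    if not all(parts):
        # An empty component would make different records collide.
        raise ValueError(f"empty key component in {list(key_fields)}")
    return delim.join(parts)


record = {"type": " News ", "numId": 12345, "title": "Some headline"}
record["id"] = make_doc_id(record, ["type", "numId"])
print(record["id"])  # news-12345
```

Doing this in the client keeps the Solr schema simple: the uniqueKey field stays a plain single-valued string, which avoids the multivalued-key oddity mentioned elsewhere in the thread.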