Re: Composite key for uniqueKeyId

2008-03-10 Thread Norberto Meijome
On Fri, 7 Mar 2008 17:59:48 -0800 (PST)
Chris Hostetter [EMAIL PROTECTED] wrote:

 I believe Norberto meant he was handling it in his update client code --
 before sending the docs to Solr.

Indeed, this is what we do. We have a process that parses certain files, generates
documents following the Solr schema in use and publishes them to the index.
This process is the one that generates the DocID based on other fields.
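
A minimal sketch of that kind of client-side key generation, in Python. The field names (type, numId), the normalisation rules and the "-" delimiter are assumptions chosen for illustration, not the actual process described above:

```python
import re
import unicodedata
from typing import Mapping, Sequence


def normalise(value: object) -> str:
    """Lower-case, trim, and collapse runs of non-alphanumerics to '_'."""
    text = unicodedata.normalize("NFKC", str(value)).strip().lower()
    return re.sub(r"[^a-z0-9]+", "_", text).strip("_")


def make_doc_id(record: Mapping[str, object],
                key_fields: Sequence[str],
                delim: str = "-") -> str:
    """Build a composite ID from the chosen fields, in the given order."""
    parts = [normalise(record[name]) for name in key_fields]
    if not all(parts):
        # An empty component would make different records share an ID.
        raise ValueError("empty key component in %r" % (list(key_fields),))
    return delim.join(parts)


# Hypothetical document, built before it is sent to the index.
doc = {"type": "News", "numId": 12345, "title": "Hello"}
doc["id"] = make_doc_id(doc, ["type", "numId"])
```

The ID is computed before the document is posted, so the schema only needs an ordinary single-valued uniqueKey field.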

 
 Something that *seems* possible but I've never actually tried is writing
 a ConcatTokenFilterFactory that queues up all the tokens and joins
 them together (using some configured string, defaulting to "") then you
 could in theory do something like this...

Yeah, I never tried this because we need somewhat more complex calculations
for the DocId.

cheers,
B

_
{Beto|Norberto|Numard} Meijome

It's not what you do, it's the love you put into it.
   Mother Theresa.

I speak for myself, not my employer. Contents may be hot. Slippery when wet.
Reading disclaimers makes you go blind. Writing them is worse. You have been
Warned.


Re: Composite key for uniqueKeyId

2008-03-08 Thread Erik Hatcher
The best thing folks can do to help get patches like this
important DataImportHandler committed to trunk is to try it out,
report back experiences, and offer suggestions for improvement.


Solr 1.3 will come in _good_ time, but not before its time.  There  
are many substantial changes in Solr between 1.2 and trunk and some  
more slated.  Knocking out any of these gets us closer to the release  
as well:


http://issues.apache.org/jira/secure/IssueNavigator.jspa?sorter/field=status&sorter/order=DESC


Erik



Re: Composite key for uniqueKeyId

2008-03-08 Thread Noble Paul നോബിള്‍ नोब्ळ्
Hi,
The tool is undergoing substantial testing in our QA department.
Because it is also an official internal project, the bugs are filed in
our bug tool. We are fixing them as and when they are reported. It has
gone through some good iterations and it is going to power the backend
for two of our products, which are going to come out in a month's time
(more are in the pipeline).

Internally it has already had a 1.0 release. The next patch is going
to contain the 1.0 release plus a few extra features.

We are testing with a dataset of ~3 million documents. Each document
is built by joining around 6 tables.

This is not to say that it is free of bugs. Please do the testing and
report back any bugs, and we will be glad to incorporate the fixes in
the next patch.

--Noble

-- 
--Noble Paul


Re: Composite key for uniqueKeyId

2008-03-08 Thread Vijay Rao
I used it to index my DB and I found no bugs. Ours is a very simple use case.

There were rough edges, though. The error logging and messages were not up to
the mark. It aborted the entire indexing run when a 'required' field was
missing. It should just skip that document, or give me an option to
configure that.

I am waiting for the next patch to file the bugs.

It gives me some confidence to know that this tool is powering AOL's
infrastructure.

Cheers
Vijay




Re: Composite key for uniqueKeyId

2008-03-07 Thread Jon Baer

Hi Norberto,

This sounds exactly like what I'm looking to do. Do you have an example?

(Keep in mind I'm using data-config.xml - DataImporter)

I'm interested in merging different types of content in, i.e.:

NEWS12345
VIDEO12345

So I'd like to end up w/ different keys per type if possible.

Thanks.

- Jon



Re: Composite key for uniqueKeyId

2008-03-07 Thread Jon Baer
That definitely sounds like the proper way to go + will try.  I'm not
too concerned w/ my keys coming back, just that I can't seem to run the
DataImportHandler w/o one.

I was able to temporarily get around it by returning it in the entity
query, i.e.:


<entity query="select concat(col1,col2,col3,col4) as id">
  <field name="id" column="id" />
</entity>
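
One caveat with a bare concat() key is that different column combinations can collapse into the same string. A small Python sketch of the effect, using made-up column values; the "-" delimiter is an assumption and only helps if the values themselves never contain it:

```python
def concat_key(*cols: str) -> str:
    """Mimics concat(col1, col2, ...): no separator between values."""
    return "".join(cols)


def delimited_key(*cols: str, delim: str = "-") -> str:
    """Same idea, but with a separator between the column values."""
    return delim.join(cols)


# Two different (hypothetical) rows that share one bare-concat key.
row_a = ("ab", "c")
row_b = ("a", "bc")

collides = concat_key(*row_a) == concat_key(*row_b)
distinct = delimited_key(*row_a) != delimited_key(*row_b)
```

Here both rows give "abc" without a separator, while the delimited keys are "ab-c" and "a-bc".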

BTW, the DataImportHandler still seems to be a patch. Is there an
estimate of if/when it will appear in trunk?


Thanks!

- Jon

On Mar 7, 2008, at 8:59 PM, Chris Hostetter wrote:



I believe Norberto meant he was handling it in his update client code --
before sending the docs to Solr.

Something that *seems* possible but I've never actually tried is writing
a ConcatTokenFilterFactory that queues up all the tokens and joins
them together (using some configured string, defaulting to "") then you
could in theory do something like this...

   <fieldType name="compositeKeyType" class="solr.TextField" omitNorms="true">
     <analyzer>
       <tokenizer class="solr.KeywordTokenizerFactory"/>
       <filter class="solr.ConcatTokenFilterFactory" delim="-"/>
     </analyzer>
   </fieldType>
   ...
   <field name="compositeKey" type="compositeKeyType" />
   <uniqueKey>compositeKey</uniqueKey>
   ...
   <copyField source="type"  dest="compositeKey"/>
   <copyField source="numId" dest="compositeKey"/>
   ...

that *might* work ... but things would be a little weird when viewing your
results (compositeKey would have to be multivalued, and it would return as
an array)


-Hoss
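
The queue-and-join behaviour described for the hypothetical ConcatTokenFilterFactory can be sketched outside Solr in a few lines of Python. This is only an illustration of the idea, not a Lucene TokenFilter; the function name and the sample values are assumptions:

```python
from typing import Iterable


def concat_tokens(tokens: Iterable[str], delim: str = "") -> str:
    """Buffer every incoming token, then emit one joined token."""
    queued = list(tokens)       # queue up all the tokens first
    return delim.join(queued)   # join with the configured string


# With KeywordTokenizer each copied field value arrives as one token,
# so two copyField sources would yield two tokens (order assumed here).
key = concat_tokens(["NEWS", "12345"], delim="-")
```

With the default empty delimiter the same input would give NEWS12345, the key shape asked about elsewhere in the thread.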





Re: Composite key for uniqueKeyId

2008-03-07 Thread Noble Paul നോബിള്‍ नोब्ळ्
Good to hear that people are using DataImportHandler.
In a couple of days we will post another patch, cleared by our QA, with
better error handling, messaging, and a lot of new features.

A committer will have to decide when it is good enough to be committed.
--Noble


-- 
--Noble Paul


Re: Composite key for uniqueKeyId

2008-03-07 Thread Vijay Rao
I am also looking forward to getting this checked into trunk.

Will there be a patch with Solr 1.2 support?
Cheers
Vijay




Re: Composite key for uniqueKeyId

2008-03-06 Thread Norberto Meijome
On Thu, 6 Mar 2008 11:33:38 -0500
Jon Baer [EMAIL PROTECTED] wrote:

 I'm interested to know if composite keys are now possible or if there
 is anything to copyField I can use to get composite keys working for
 my doc ids?

FWIW, we just do this @ doc generation time - grab several fields, massage them 
into shape, normalise, assign to docID
B
_
{Beto|Norberto|Numard} Meijome

...using the internet as it was originally intended... for the further research 
of pornography and pipebombs.
