RE: Sharing dih dictionaries

2011-12-06 Thread Brent Mills
You're totally correct.  There's actually a link on the DIH page now which 
wasn't there when I had read it a long time ago.  I'm really looking forward to 
4.0, it's got a ton of great new features.  Thanks for the links!!

-----Original Message-----
From: Mikhail Khludnev [mailto:mkhlud...@griddynamics.com] 
Sent: Monday, December 05, 2011 10:45 PM
To: solr-user@lucene.apache.org
Subject: Re: Sharing dih dictionaries

It looks like https://issues.apache.org/jira/browse/SOLR-2382 or even 
https://issues.apache.org/jira/browse/SOLR-2613.
I guess by using SOLR-2382 you can specify your own SortedMapBackedCache 
subclass which is able to share your Dictionary.
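A minimal sketch of the sharing idea, under loud assumptions: the class below is a hypothetical stand-in and does not implement or extend Solr's real DIH cache classes (a real solution would subclass SortedMapBackedCache as suggested above). The names SharedDictionaryCache, loadPartCodes and loadCount are invented for illustration, and the sample codes are made up. It only shows the mechanism: keep one backing map per dictionary name in a static registry, so that several entity-level cache instances reuse it and the loader (the SQL query, in the DIH case) runs once. It uses Java 8 features.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical stand-in for a DIH cache; NOT Solr's actual cache API. */
public class SharedDictionaryCache {
    // One backing map per dictionary name, shared by all instances in this JVM.
    private static final Map<String, Map<String, String>> SHARED = new ConcurrentHashMap<>();
    // Counts how many times a loader actually ran (illustration only).
    private static final AtomicInteger LOADS = new AtomicInteger();

    private final Map<String, String> rows;

    public SharedDictionaryCache(String dictionaryName, Supplier<Map<String, String>> loader) {
        // computeIfAbsent runs the loader only for the first instance that asks
        // for this dictionary name; later instances get the same map back.
        this.rows = SHARED.computeIfAbsent(dictionaryName, name -> {
            LOADS.incrementAndGet();
            return new TreeMap<>(loader.get());
        });
    }

    /** Returns the description for a code, or null if the code is unknown. */
    public String lookup(String code) {
        return rows.get(code);
    }

    public static int loadCount() {
        return LOADS.get();
    }

    /** Stands in for "select code, description from partcodes"; sample data is invented. */
    public static Map<String, String> loadPartCodes() {
        Map<String, String> rows = new TreeMap<>();
        rows.put("A1", "Hex bolt");
        rows.put("B2", "Lock nut");
        rows.put("C3", "Washer");
        return rows;
    }

    public static void main(String[] args) {
        // Three "entities", one per code column, all backed by the same dictionary.
        SharedDictionaryCache code1 = new SharedDictionaryCache("partcodes", SharedDictionaryCache::loadPartCodes);
        SharedDictionaryCache code2 = new SharedDictionaryCache("partcodes", SharedDictionaryCache::loadPartCodes);
        SharedDictionaryCache code3 = new SharedDictionaryCache("partcodes", SharedDictionaryCache::loadPartCodes);

        System.out.println(code1.lookup("A1") + ", " + code2.lookup("B2") + ", " + code3.lookup("C3"));
        System.out.println("loader runs: " + loadCount());
    }
}
```

The shared map is never modified after loading, which is what makes it reasonable to read it from several entities at once in this sketch; a cache that is updated during the import would need more care.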

Regards

On Tue, Dec 6, 2011 at 12:26 AM, Brent Mills bmi...@uship.com wrote:

 I'm not really sure how to title this but here's what I'm trying to do.

 I have a query that creates a rather large dictionary of codes that 
 are shared across multiple fields of a base entity.  I'm using the 
 cachedsqlentityprocessor but I was curious if there was a way to join 
 this multiple times to the base entity so I can avoid having to reload 
 it for each column join.

 Ex:
 <entity name="parts" query="select name, code1, code2, code3 from parts">
   <field column="name" name="name" />
   <entity name="shareddictionary1" query="select code, description from partcodes" where="code=parts.code1">
     <field column="description" name="code1desc" />
   </entity>
   <entity name="shareddictionary2" query="select code, description from partcodes" where="code=parts.code2">
     <field column="description" name="code2desc" />
   </entity>
   <entity name="shareddictionary3" query="select code, description from partcodes" where="code=parts.code3">
     <field column="description" name="code3desc" />
   </entity>
 </entity>

 Kind of a simplified example but in this case the dictionary query has 
 to be run 3 times to join 3 different columns.  It would be nice if I 
 could load the data set once as an entity and specify how to join it 
 in code without requiring a separate sql query.  Any ideas?




--
Sincerely yours
Mikhail Khludnev
Developer
Grid Dynamics
tel. 1-415-738-8644
Skype: mkhludnev
http://www.griddynamics.com
 mkhlud...@griddynamics.com


RE: Sharing dih dictionaries

2011-12-06 Thread Dyer, James
Just FYI that the final piece of SOLR-2382 has not been committed, and instead 
has been spun off to SOLR-2943.  So if you're using trunk and you need the 
ability to persist a cache on disk and then read it back again later as a DIH 
entity, you'll need both SOLR-2943 and also a cache implementation.  We're 
using the BDB-JE cache from SOLR-2613 in production.  There is also one backed 
with a Lucene index (SOLR-2948).
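For readers unfamiliar with the persist-then-reread idea, here is a rough sketch of it in isolation. It is not the SOLR-2943 code and uses neither BDB-JE nor Lucene: a plain java.util.Properties file stands in for the on-disk cache, and the class name PersistedDictionary and its methods are hypothetical. It only illustrates writing a code-to-description dictionary to disk once and reading it back later without touching the database.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

/** Hypothetical sketch: a Properties file stands in for a persistent DIH cache. */
public class PersistedDictionary {

    /** Writes the dictionary to disk (the "persist a cache" step). */
    public static void save(Map<String, String> dictionary, Path file) {
        Properties props = new Properties();
        props.putAll(dictionary);
        try (OutputStream out = Files.newOutputStream(file)) {
            props.store(out, "partcodes dictionary");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the dictionary back (the "read it back later" step). */
    public static Map<String, String> load(Path file) {
        Properties props = new Properties();
        try (InputStream in = Files.newInputStream(file)) {
            props.load(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Map<String, String> dictionary = new TreeMap<>();
        for (String code : props.stringPropertyNames()) {
            dictionary.put(code, props.getProperty(code));
        }
        return dictionary;
    }

    /** Saves to a temporary file, loads it again, and cleans up. */
    public static Map<String, String> roundTrip(Map<String, String> dictionary) {
        try {
            Path file = Files.createTempFile("partcodes", ".properties");
            try {
                save(dictionary, file);
                return load(file);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> dictionary = new TreeMap<>();
        dictionary.put("A1", "Hex bolt");   // invented sample data
        dictionary.put("B2", "Lock nut");
        System.out.println(roundTrip(dictionary));
    }
}
```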

James Dyer
E-Commerce Systems
Ingram Content Group
(615) 213-4311



Re: Sharing dih dictionaries

2011-12-06 Thread Mikhail Khludnev
AFAIK the DIH jar is separate from the Solr war. Is there a chance of using
the DIH from 4.0 with Solr 3.4?

James,
Sorry for hijacking the thread.
But do you have a chance to review SOLR-2947
(https://issues.apache.org/jira/browse/SOLR-2947)? I want to provide a patch
to fix multi-threading in DIH, but formally speaking that issue, together
with SOLR-2933 (https://issues.apache.org/jira/browse/SOLR-2933), blocks me.

Regards


Re: Sharing dih dictionaries

2011-12-05 Thread Mikhail Khludnev
It looks like https://issues.apache.org/jira/browse/SOLR-2382 or even
https://issues.apache.org/jira/browse/SOLR-2613.
I guess by using SOLR-2382 you can specify your own SortedMapBackedCache
subclass which is able to share your Dictionary.

Regards
