On Tue, Nov 25, 2008 at 11:35 PM, Amit Nithian <[EMAIL PROTECTED]> wrote:
> Thanks for the responses. Few follow-ups:
> 1) It seems that the CachedSqlEntityProcessor evaluates the where clause in
> memory on the cache. Is this cache an in-memory RDBMS or a set of maps?
It is a hash map held in memory.
> 2) In the example, there were two use cases, one that is like query="select
> * from Y where xid=${X.ID}" and another where it's query="select * from Y"
> where="xid=${x.ID}". Is there any difference in how CachedSqlEntityProcessor
> behaves? Does it know to strip off the WHERE clause and simply cache the
> "select * from Y"?
It fetches all the rows using the 'query' first.

The where="xid=x.id" attribute (note: no ${} here)
is evaluated against the map. All the xid values are kept in the map as
keys, and the lookup is done on the map after evaluating 'x.id' as
${x.id}.
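
As a rough sketch of that second form, reusing the entity names x and y from the original question (the table and column names are just the ones used in this thread, and this is not a tested config):

```xml
<!-- Sketch only: the outer entity x drives the import -->
<entity name="x" query="select * from x">
  <!-- The rows returned by 'query' are fetched first and kept in a map keyed by xid.
       'where' ties the cached column xid to the outer entity's x.id (no ${} here). -->
  <entity name="y" query="select * from y"
          processor="CachedSqlEntityProcessor"
          where="xid=x.id">
  </entity>
</entity>
```

In this form the inner query no longer contains ${x.id}, so its text is the same for every row of x, and only the key used for the map lookup changes from row to row.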


Then for subsequent requests it looks
>
> What are some dataset sizes that have been tested using this framework and
> what are some performance metrics?
>
> Thanks again
> Amit
>
> On Tue, Nov 25, 2008 at 7:32 AM, Noble Paul നോബിള്‍ नोब्ळ् <
> [EMAIL PROTECTED]> wrote:
>
>> Every row emitted by an outer entity results in a new SQL query in the
>> inner entity (yes, 500000 queries on the inner entity). So, if you wish to
>> join multiple tables, then nested entities are the way to go.
>>
>> CachedSqlEntityProcessor is meant to help you reduce the number of
>> queries fired on sub-entities.
>>
>> If you get the entire table in one query (by using select * from y)
>> and use a separate where attribute, the entire set of rows in y gets
>> loaded into RAM.
>>
>> If you use it w/o the where attribute, it still ends up loading the
>> entire table into memory (it is an unbounded cache). It can easily
>> give you an OOM.
>>
>> Do not use CachedSqlEntityProcessor for tidying up. Use it if you
>> wish to save time and you have a lot of RAM.
>>
>>
>> On Tue, Nov 25, 2008 at 1:52 PM, Amit Nithian <[EMAIL PROTECTED]> wrote:
>> > I am starting to look at Solr's Data Import Handler framework and am quite
>> > impressed with it so far. My question is in trying to reduce the number of
>> > SQL queries issued to the database and saw this entity processor.
>> >
>> > In the following example:
>> > <entity name="x" query="select * from x">
>> >    <entity name="y" query="select * from y where xid=${x.id}"
>> >            processor="CachedSqlEntityProcessor">
>> >    </entity>
>> > </entity>
>> >
>> > I like the concept of having multiple entity blocks for clarity but why
>> > wouldn't I have (for DB efficiency) the following as one entity's SQL
>> > statement "select * from X,Y where x.id=y.xid" and have two fields pointing
>> > at X and Y columns? My main question though is how the
>> > CachedSqlEntityProcessor helps in this case, for I want to use the multiple
>> > entity blocks for cleanliness. If I have 500,000 X records, how many SQL
>> > queries in the second entity block (y) would get executed, 500000?
>> >
>> > If there is any more detailed information about the number of queries
>> > executed in different circumstances, memory overhead or way that the data is
>> > brought from the database into Java, it would be much appreciated for it's
>> > important for my application.
>> >
>> > Thanks in advance!
>> > Amit
>> >
>>
>>
>>
>> --
>> --Noble Paul
>>
>



-- 
--Noble Paul
