[jira] Assigned: (SOLR-611) Add sort_values returned by QueryComponent to solrj.response.QueryResponse

2008-07-28 Thread Shalin Shekhar Mangar (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar reassigned SOLR-611:
--

Assignee: Shalin Shekhar Mangar

> Add sort_values returned by QueryComponent to solrj.response.QueryResponse
> --
>
> Key: SOLR-611
> URL: https://issues.apache.org/jira/browse/SOLR-611
> Project: Solr
>  Issue Type: Improvement
>  Components: clients - java
>Reporter: Dan Rosher
>Assignee: Shalin Shekhar Mangar
>Priority: Minor
> Fix For: 1.3
>
> Attachments: SOLR-611.patch
>
>
> Add sort_values returned by QueryComponent to solrj.response.QueryResponse

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-664) Highlighter (maybe phraseHighlighter) is highlighting non-highlight fields in query

2008-07-28 Thread Brian Whitman (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617405#action_12617405
 ] 

Brian Whitman commented on SOLR-664:


Ah, missed that one. Any reason this is not on by default? Does it alter 
results in any meaningful way?


> Highlighter (maybe phraseHighlighter) is highlighting non-highlight fields in 
> query
> ---
>
> Key: SOLR-664
> URL: https://issues.apache.org/jira/browse/SOLR-664
> Project: Solr
>  Issue Type: Bug
>  Components: highlighter
>Affects Versions: 1.3
>Reporter: Brian Whitman
> Fix For: 1.3
>
>
> Query: q=+content:"a b c" +type:web&hl.simple.pre=&hl.simple.post=&highlight=true&hl.fl=content
> Returns docs like:
> a b c web
> Highlighter should only return fragments matched by the field denoted in the 
> highlight.
> Happens whether or not usePhraseHighlighter=true. (SOLR-553)




[jira] Resolved: (SOLR-664) Highlighter (maybe phraseHighlighter) is highlighting non-highlight fields in query

2008-07-28 Thread Brian Whitman (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brian Whitman resolved SOLR-664.


Resolution: Invalid




[jira] Resolved: (SOLR-611) Add sort_values returned by QueryComponent to solrj.response.QueryResponse

2008-07-28 Thread Shalin Shekhar Mangar (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar resolved SOLR-611.


Resolution: Fixed

Committed revision 680319.

Thanks Dan!




[jira] Commented: (SOLR-474) audit docs for Spellchecker

2008-07-28 Thread Shalin Shekhar Mangar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617407#action_12617407
 ] 

Shalin Shekhar Mangar commented on SOLR-474:


With the new SpellCheckComponent (SOLR-572) coming in, is this issue still 
valid?

> audit docs for Spellchecker
> ---
>
> Key: SOLR-474
> URL: https://issues.apache.org/jira/browse/SOLR-474
> Project: Solr
>  Issue Type: Task
>Affects Versions: 1.3
>Reporter: Hoss Man
>Assignee: Mike Klaas
> Fix For: 1.3
>
>
> according to this troubling comment from Mike, the highlighter javadocs (and 
> wiki) may not reflect reality...
> http://www.nabble.com/spellcheckhandler-to14627712.html#a14627712
> {quote}
> Multi-word spell checking is available only with extendedResults=true, and 
> only in trunk.  I
> believe that the current javadocs are incorrect on this point.
> {quote}
> we should audit/fix this before 1.3




[jira] Updated: (SOLR-611) Add sort_values returned by QueryComponent to solrj.response.QueryResponse

2008-07-28 Thread Shalin Shekhar Mangar (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar updated SOLR-611:
---

Affects Version/s: 1.3




Re: Solr 1.3 release date

2008-07-28 Thread Yonik Seeley
On Thu, Jul 10, 2008 at 11:56 AM, Yonik Seeley <[EMAIL PROTECTED]> wrote:
> So how about we make a 1.3 branch in about 2 weeks (the 25th), and
> release soon after.

Given the flurry of recent activity, seems like extending this a
little longer (another week?) would be a good idea.

-Yonik


Re: Solr 1.3 release date

2008-07-28 Thread Noble Paul നോബിള്‍ नोब्ळ्
Let us do another round of bug scrubbing: identify all the issues to
be fixed in this release and defer those that are not important for
it.

On Mon, Jul 28, 2008 at 6:24 PM, Yonik Seeley <[EMAIL PROTECTED]> wrote:
> On Thu, Jul 10, 2008 at 11:56 AM, Yonik Seeley <[EMAIL PROTECTED]> wrote:
>> So how about we make a 1.3 branch in about 2 weeks (the 25th), and
>> release soon after.
>
> Given the flurry of recent activity, seems like extending this a
> little longer (another week?) would be a good idea.
>
> -Yonik
>



-- 
--Noble Paul


[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617420#action_12617420
 ] 

Yonik Seeley commented on SOLR-665:
---

This implementation isn't thread safe (due to the removed synchronization).
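
As an illustration of the point above, here is a minimal sketch of a FIFO cache that keeps every map access under one lock. It is not the attached FIFOCache.java and not Solr's LRUCache; the class name SynchronizedFifoCache, the hammer() helper and the sizes used are assumptions made for this example only. With the lock in place, the lookup count and the final size come out the same on every run, which an unsynchronized LinkedHashMap does not guarantee.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration; the class and method names are not from Solr.
class SynchronizedFifoCache<K, V> {
  private final Map<K, V> map;
  private long lookups; // guarded by the lock on map

  SynchronizedFifoCache(final int maxSize) {
    // accessOrder=false keeps insertion order, so eviction is FIFO
    this.map = new LinkedHashMap<K, V>(maxSize, 0.75f, false) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxSize;
      }
    };
  }

  V get(K key) {
    synchronized (map) {
      lookups++;
      return map.get(key);
    }
  }

  void put(K key, V value) {
    synchronized (map) {
      map.put(key, value);
    }
  }

  long lookups() {
    synchronized (map) {
      return lookups;
    }
  }

  int size() {
    synchronized (map) {
      return map.size();
    }
  }

  /** Runs several threads that each do opsPerThread put+get pairs on one cache. */
  static SynchronizedFifoCache<Integer, Integer> hammer(int threads, final int opsPerThread, int maxSize) {
    final SynchronizedFifoCache<Integer, Integer> cache =
        new SynchronizedFifoCache<Integer, Integer>(maxSize);
    Thread[] workers = new Thread[threads];
    for (int t = 0; t < threads; t++) {
      final int base = t * opsPerThread; // each thread uses its own key range
      workers[t] = new Thread(new Runnable() {
        @Override
        public void run() {
          for (int i = 0; i < opsPerThread; i++) {
            cache.put(base + i, i);
            cache.get(base + i);
          }
        }
      });
      workers[t].start();
    }
    try {
      for (Thread w : workers) {
        w.join();
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new RuntimeException(e);
    }
    return cache;
  }

  public static void main(String[] args) {
    SynchronizedFifoCache<Integer, Integer> cache = hammer(4, 10000, 100);
    System.out.println("lookups=" + cache.lookups() + " size=" + cache.size());
  }
}
```

The sketch says nothing about how much the lock costs under contention; it only shows what the lock guarantees.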

> FIFO Cache (Unsynchronized): 9x times performance boost
> ---
>
> Key: SOLR-665
> URL: https://issues.apache.org/jira/browse/SOLR-665
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 1.3
> Environment: JRockit R27 (Java 6)
>Reporter: Fuad Efendi
> Attachments: FIFOCache.java
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Attached is a modified version of LRUCache where:
> 1. map = new LinkedHashMap(initialSize, 0.75f, false), so that 
> "reordering"/true (the performance bottleneck of LRU) is replaced with 
> "insertion-order"/false (so that it becomes FIFO)
> 2. Almost all (absolutely unnecessary) synchronized statements are commented out
> See discussion at 
> http://www.nabble.com/LRUCache---synchronized%21--td16439831.html
> Performance metrics (taken from SOLR Admin):
> LRU
> Requests: 7638
> Average Time-Per-Request: 15300
> Average Request-per-Second: 0.06
> FIFO:
> Requests: 3355
> Average Time-Per-Request: 1610
> Average Request-per-Second: 0.11
> Performance increased 9 times, which roughly corresponds to the number of CPUs in 
> the system, http://www.tokenizer.org/ (Shopping Search Engine at Tokenizer.org)
> Current number of documents: 7494689
> name:  filterCache  
> class:org.apache.solr.search.LRUCache  
> version:  1.0  
> description:  LRU Cache(maxSize=1000, initialSize=1000)  
> stats:lookups : 15966954582
> hits : 16391851546
> hitratio : 0.102
> inserts : 4246120
> evictions : 0
> size : 2668705
> cumulative_lookups : 16415839763
> cumulative_hits : 16411608101
> cumulative_hitratio : 0.99
> cumulative_inserts : 4246246
> cumulative_evictions : 0 
> Thanks
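
The third constructor argument mentioned in item 1 of the description can be seen in isolation with a small sketch. The class EvictionOrderDemo, its helpers and the bound of 3 entries are hypothetical and exist only for this illustration; java.util.LinkedHashMap and removeEldestEntry are standard JDK. Note that, per the LinkedHashMap documentation, a get() on an access-ordered map counts as a structural modification, which matters for the synchronization discussion in this thread.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class; not part of Solr or of the attached FIFOCache.java.
class EvictionOrderDemo {

  /** Builds a bounded map; accessOrder=true gives LRU eviction, false gives FIFO. */
  static <K, V> Map<K, V> boundedMap(final int maxSize, boolean accessOrder) {
    return new LinkedHashMap<K, V>(maxSize, 0.75f, accessOrder) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxSize;
      }
    };
  }

  /** Inserts a, b, c, reads a, then inserts d into a map bounded at 3 entries. */
  static List<String> keysAfterWorkload(boolean accessOrder) {
    Map<String, Integer> map = boundedMap(3, accessOrder);
    map.put("a", 1);
    map.put("b", 2);
    map.put("c", 3);
    map.get("a");    // in access order this moves "a" to the most-recent end
    map.put("d", 4); // exceeds the bound, so the eldest entry is evicted
    return new ArrayList<String>(map.keySet());
  }

  public static void main(String[] args) {
    System.out.println("LRU  (accessOrder=true):  " + keysAfterWorkload(true));  // [c, a, d]
    System.out.println("FIFO (accessOrder=false): " + keysAfterWorkload(false)); // [b, c, d]
  }
}
```

With access order the recently read "a" survives and "b" is evicted; with insertion order the read has no effect and "a", the oldest insert, is evicted.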




[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617422#action_12617422
 ] 

Noble Paul commented on SOLR-665:
-

Why is almost the entire _get()_ method in LRUCache synchronized on the map? 
The stats increments could possibly be moved out of the synchronized block.




[jira] Commented: (SOLR-614) Allow components to read any kind of XML from solrconfig

2008-07-28 Thread Erik Hatcher (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617423#action_12617423
 ] 

Erik Hatcher commented on SOLR-614:
---

I still think this is an unnecessary addition that has the potential to be 
confusing.  Even though the current config stuff is ugly to navigate, it is only 
one way, and one way will be easier to support.

I don't want to thwart your efforts though.  -0

I'd really prefer this be taken to a post-1.3 commit at the very least though, 
so we can flesh out config syntax and infrastructure a bit more.

> Allow components to read any kind of XML from solrconfig
> 
>
> Key: SOLR-614
> URL: https://issues.apache.org/jira/browse/SOLR-614
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 1.3
>Reporter: Noble Paul
>Assignee: Shalin Shekhar Mangar
>Priority: Trivial
> Fix For: 1.3
>
> Attachments: SOLR-614.patch, SOLR-614.patch, SOLR-614.patch, 
> SOLR-614.patch
>
>
> All the components initialized by Solr have an init(NamedList args) 
> initializer. This leads us to write the configuration needed for the 
> component in the NamedList XML format. People familiar with Solr may know the 
> format, but most of what is written is more noise than information. Users who 
> are not familiar with the format find it difficult to understand why they 
> have to write it this way. Moreover, it is not a very efficient way to 
> configure a component.




[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617427#action_12617427
 ] 

Yonik Seeley commented on SOLR-665:
---

bq. Why do we have almost the entire get() method synchronized on the map in 
LRUCache . 

It simplified the code and reduced the number of branches.  What's your 
proposed version?
Here is the current method:
{code}
  public Object get(Object key) {
    synchronized (map) {
      Object val = map.get(key);
      if (state == State.LIVE) {
        // only increment lookups and hits if we are live.
        lookups++;
        stats.lookups.incrementAndGet();
        if (val!=null) {
          hits++;
          stats.hits.incrementAndGet();
        }
      }
      return val;
    }
  }
{code}




[jira] Commented: (SOLR-614) Allow components to read any kind of XML from solrconfig

2008-07-28 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617430#action_12617430
 ] 

Noble Paul commented on SOLR-614:
-

bq. I still think this is an unnecessary addition that has the potential to be 
confusing. Even though the current config stuff is ugly to navigate, it is only 
one way, and one way will be easier to support.

All the changes are under the skin. There will be no changes to the 
configuration or public API. All the components *must* stick to the old 
configuration, so I hope there will be no confusion.




[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617431#action_12617431
 ] 

Noble Paul commented on SOLR-665:
-

I guess this is safe and faster
{code}
  public Object get(Object key) {
    synchronized (map) {
      Object val = map.get(key);
      if (state == State.LIVE) {
        // only increment lookups and hits if we are live.
        lookups++;
      }
      stats.lookups.incrementAndGet();
      if (val != null) {
        hits++;
        stats.hits.incrementAndGet();
      }
      return val;
    }
  }
{code}




[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617431#action_12617431
 ] 

noble.paul edited comment on SOLR-665 at 7/28/08 7:10 AM:
--

I guess this is safe and faster
{code}
  public Object get(Object key) {
    synchronized (map) {
      Object val = map.get(key);
      if (state == State.LIVE) {
        // only increment lookups and hits if we are live.
        lookups++;
      }
      stats.lookups.incrementAndGet();
      if (val != null) {
        //hits++; we must remove the hits field. It needs changes to
        // getStatistics also
        stats.hits.incrementAndGet();
      }
      return val;
    }
  }
{code}

let us remove the field hits and use stats.hits wherever we need it




[jira] Commented: (SOLR-614) Allow components to read any kind of XML from solrconfig

2008-07-28 Thread Shalin Shekhar Mangar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617433#action_12617433
 ] 

Shalin Shekhar Mangar commented on SOLR-614:


Erik -- One reason that I'm so +1 on it is that if it does not go into 1.3, all 
custom (internal) plugins we write for 1.3 will be stuck with the ugly config 
format or clunky xpath code for at least the next six to eight months until we 
have another release.




[jira] Commented: (SOLR-647) Do SolrCore.close() in a refcounted way

2008-07-28 Thread Shalin Shekhar Mangar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617434#action_12617434
 ] 

Shalin Shekhar Mangar commented on SOLR-647:


Yonik, how do you feel about adding this to 1.3? Although its immediate use is 
in SOLR-561, it is still an unsafe existing API call.

> Do SolrCore.close() in a refcounted way
> ---
>
> Key: SOLR-647
> URL: https://issues.apache.org/jira/browse/SOLR-647
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 1.3
>Reporter: Noble Paul
>Priority: Minor
> Attachments: SOLR-647.patch
>
>
> The method _SolrCore.close()_ directly closes the core. It can cause 
> exceptions for in-flight requests. The _close()_ method should just 
> decrement a refcount, and the actual close must happen when the last request 
> being processed by that core instance is completed.




[jira] Commented: (SOLR-647) Do SolrCore.close() in a refcounted way

2008-07-28 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617435#action_12617435
 ] 

Yonik Seeley commented on SOLR-647:
---

I took a quick peek, and it looks like there are probably race conditions.
A core could be obtained in thread A, then decRef() could be called in thread B, 
triggering a real close, and then incRef() would be called in thread A (oops).
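
One common way to close this kind of window, sketched here under stated assumptions, is to make the increment conditional: a compare-and-set loop that refuses to take a reference once the count has reached zero. This is not the SOLR-647 patch and not the SolrCore API; RefCountedResource, tryIncRef() and demo() are hypothetical names used only for the illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a core; not the SOLR-647 patch and not Solr's SolrCore API.
class RefCountedResource {
  private final AtomicInteger refCount = new AtomicInteger(1); // the owner's reference
  private volatile boolean closed = false;

  /** Tries to take a reference; fails once the count has reached zero. */
  boolean tryIncRef() {
    while (true) {
      int current = refCount.get();
      if (current <= 0) {
        return false; // already closed (or closing): the caller must not use the resource
      }
      if (refCount.compareAndSet(current, current + 1)) {
        return true;
      }
      // another thread changed the count in between; retry with the fresh value
    }
  }

  /** Releases one reference; the caller that drops the last one performs the real close. */
  void decRef() {
    int remaining = refCount.decrementAndGet();
    if (remaining == 0) {
      closed = true; // a real implementation would release its resources here
    } else if (remaining < 0) {
      throw new IllegalStateException("decRef() called more often than references were taken");
    }
  }

  boolean isClosed() {
    return closed;
  }

  /** Walks through the sequence from the comment above, in a single thread. */
  static String demo() {
    RefCountedResource core = new RefCountedResource();
    boolean taken = core.tryIncRef();            // a request takes a reference: count 2
    core.decRef();                               // the owner "closes": count 1, still open
    boolean openWhileInFlight = !core.isClosed();
    core.decRef();                               // the request finishes: count 0, real close
    boolean lateTaken = core.tryIncRef();        // a late caller is refused
    return taken + "," + openWhileInFlight + "," + core.isClosed() + "," + lateTaken;
  }

  public static void main(String[] args) {
    System.out.println(demo()); // true,true,true,false
  }
}
```

The design choice is that a failed tryIncRef() is reported to the caller, which then has to look up a fresh instance or fail the request, instead of silently resurrecting a closed one.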




[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617431#action_12617431
 ] 

noble.paul edited comment on SOLR-665 at 7/28/08 7:19 AM:
--

I guess this is safe and faster
{code}
  public Object get(Object key) {
    Object val;
    synchronized (map) {
      val = map.get(key);
      if (state == State.LIVE) {
        // only increment lookups and hits if we are live.
        lookups++;
      }
    }
    // the atomic stats counters are updated outside the synchronized block
    stats.lookups.incrementAndGet();
    if (val != null) {
      //hits++; this must be removed
      stats.hits.incrementAndGet();
    }
    return val;
  }
{code}

let us remove the field hits and use stats.hits wherever we need it




[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617436#action_12617436
 ] 

Yonik Seeley commented on SOLR-665:
---

Your version changes what the method does in a couple of respects though.
bq. let us remove the field hits and use stats.hits wherever we need it

stats.hits is for all caches of this type (it's shared across searchers).  hits 
is local to this cache instance.
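
The distinction between the two counters can be sketched as follows. This is a simplified, hypothetical model and not Solr's LRUCache: the field names lookups, hits and stats follow the code quoted in this thread, while CountingCache, CumulativeStats, localHits() and demo() are assumptions for the example, and the state == State.LIVE check is left out. The per-instance counters stay under the map lock; the shared counters are AtomicLong and can be updated outside it.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical simplification; only the field names follow the thread.
class CountingCache {
  /** Counters shared by every cache instance of the same kind, for example across searchers. */
  static class CumulativeStats {
    final AtomicLong lookups = new AtomicLong();
    final AtomicLong hits = new AtomicLong();
  }

  private final Map<Object, Object> map = new LinkedHashMap<Object, Object>(16, 0.75f, true);
  private final CumulativeStats stats;
  private long lookups; // local to this instance, guarded by the lock on map
  private long hits;    // local to this instance, guarded by the lock on map

  CountingCache(CumulativeStats stats) {
    this.stats = stats;
  }

  void put(Object key, Object value) {
    synchronized (map) {
      map.put(key, value);
    }
  }

  Object get(Object key) {
    Object val;
    synchronized (map) {
      val = map.get(key); // reorders an access-ordered map, so it needs the lock
      lookups++;
      if (val != null) {
        hits++;
      }
    }
    // the shared counters are atomic and do not need the map lock
    stats.lookups.incrementAndGet();
    if (val != null) {
      stats.hits.incrementAndGet();
    }
    return val;
  }

  long localHits() {
    synchronized (map) {
      return hits;
    }
  }

  /** Two instances share one stats object; returns {older hits, newer hits, shared hits, shared lookups}. */
  static long[] demo() {
    CumulativeStats shared = new CumulativeStats();
    CountingCache older = new CountingCache(shared);
    CountingCache newer = new CountingCache(shared);
    older.put("q", "docs");
    newer.put("q", "docs");
    older.get("q");       // hit on the older instance
    newer.get("q");       // hit on the newer instance
    newer.get("q");       // second hit on the newer instance
    newer.get("missing"); // miss: counted as a lookup only
    return new long[] { older.localHits(), newer.localHits(), shared.hits.get(), shared.lookups.get() };
  }

  public static void main(String[] args) {
    long[] r = demo();
    System.out.println("older=" + r[0] + " newer=" + r[1] + " sharedHits=" + r[2] + " sharedLookups=" + r[3]);
  }
}
```

In this model, dropping the local hits field would lose the per-instance figure (1 and 2 here) and leave only the shared total (3).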




[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617436#action_12617436
 ] 

[EMAIL PROTECTED] edited comment on SOLR-665 at 7/28/08 7:35 AM:


Your version changes what the method does in a couple of respects though.
bq. let us remove the field hits and use stats.hits wherever we need it
stats.hits is for all caches of this type (it's shared across searchers).  hits 
is local to this cache instance.

  was (Author: [EMAIL PROTECTED]):
Your version changes what the method does in a couple of respects though.
bq. let us remove the field hits and use stats.hits wherever we need it
stats.hits is for all caches of this time (it's shared across searchers).  hits 
is local to this cache instance.
  



[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Fuad Efendi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617441#action_12617441
 ] 

Fuad Efendi commented on SOLR-665:
--

Regarding Thread Safety:

- yes, we need a synchronized get() method for the LRU cache because each access 
_reorders_ the LinkedHashMap.
- absolutely no need to synchronize the get() method for FIFO!
- probably we need to deal with insertion, but not synchronize it: instead, 
extend LinkedHashMap and make some 'removal' synchronized...  With caches large 
enough to store all objects we do not need it. We probably do not need to 
synchronize 'removal' at all - it removes the entry but does not remove/finalize 
the referenced object.

From JavaDoc: "Note that this implementation is not synchronized. If multiple 
threads access a linked hash map concurrently, and at least one of the threads 
modifies the map structurally, it must be synchronized externally."
However, we do not modify the cache structurally during an iteration loop or any 
other 'structure' access (we do not use Iterator!) - so, the advice from JavaDoc 
is not applicable.

We should synchronize only removeEntryForKey of HashMap; unfortunately we can't 
do it without rewriting HashMap. Probably we can use ConcurrentHashMap as a 
base class of LinkedHashMap, but I don't know the answer yet... I am guessing that 
unsynchronized entry removal won't be a significant issue in a multithreaded 
environment.
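
For readers following the thread, the two LinkedHashMap modes being compared can be seen in a minimal sketch. The class and method names are hypothetical and are not taken from the attached FIFOCache.java; the bound of 3 entries is chosen only to keep the output short. Note that the LinkedHashMap Javadoc states that in access-ordered maps merely calling get() is a structural modification, which is not the case for insertion-ordered maps.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: shows which entry is evicted in each ordering mode.
class LinkedHashMapOrderDemo {

  // Fills a map bounded to 3 entries, reads "a", then inserts a 4th key.
  static List<String> keysAfterEviction(boolean accessOrder) {
    final int maxSize = 3;
    Map<String, Integer> map =
        new LinkedHashMap<String, Integer>(16, 0.75f, accessOrder) {
          @Override
          protected boolean removeEldestEntry(Map.Entry<String, Integer> eldest) {
            return size() > maxSize; // evict once the bound is exceeded
          }
        };
    map.put("a", 1);
    map.put("b", 2);
    map.put("c", 3);
    map.get("a");    // access order: relinks "a" to the end; insertion order: no change
    map.put("d", 4); // triggers eviction of the eldest entry
    return new ArrayList<String>(map.keySet());
  }

  public static void main(String[] args) {
    System.out.println("insertion order (FIFO): " + keysAfterEviction(false)); // [b, c, d]
    System.out.println("access order (LRU):     " + keysAfterEviction(true));  // [c, a, d]
  }
}
```

In FIFO mode the read of "a" does not save it from eviction; in LRU mode it does, at the cost of get() rewriting the internal links.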




[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617442#action_12617442
 ] 

Yonik Seeley commented on SOLR-665:
---

bq. absolutely no need to synchronize get() method for FIFO!

A cache is not a read-only object.  Gets need to be synchronized because other 
threads can be changing the cache.
If anyone wants to learn more about thread safety and concurrency, I'd 
recommend "Java Concurrency in Practice":
http://www.javaconcurrencyinpractice.com/




[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Fuad Efendi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617443#action_12617443
 ] 

Fuad Efendi commented on SOLR-665:
--

- Classes from java.util.concurrent.atomic are designed NOT to need 
synchronization; per-instance stats should use AtomicLong instead of {{long}}:
{{  private long lookups;
  private long hits;
  private long inserts;
  private long evictions;}}
- the get() method of FIFO does not need any synchronization
- the get() method of LRU reorders the LinkedHashMap; unsynchronized access may 
cause orphan Entry objects
- synchronized insertion for FIFO won't cause performance degradation because 
get() is unsynchronized.
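
A minimal sketch of the AtomicLong suggestion above, for illustration. The class name, the method names and the small multi-threaded driver are hypothetical and do not come from the attached patch. This only makes the counters themselves safe to update without a lock; it says nothing about whether the underlying map may be read without synchronization.

```java
import java.util.concurrent.atomic.AtomicLong;

// Illustrative only: lock-free per-instance statistics.
class AtomicCacheStats {
  private final AtomicLong lookups = new AtomicLong();
  private final AtomicLong hits = new AtomicLong();

  // Each counter is updated atomically, but the pair is not updated as one unit.
  void recordLookup(boolean hit) {
    lookups.incrementAndGet();
    if (hit) {
      hits.incrementAndGet();
    }
  }

  long lookups() { return lookups.get(); }
  long hits() { return hits.get(); }

  // Hypothetical driver: several threads record lookups; every other one is a hit.
  static AtomicCacheStats hammer(int threads, final int perThread) {
    final AtomicCacheStats stats = new AtomicCacheStats();
    Thread[] workers = new Thread[threads];
    for (int t = 0; t < threads; t++) {
      workers[t] = new Thread(new Runnable() {
        public void run() {
          for (int i = 0; i < perThread; i++) {
            stats.recordLookup(i % 2 == 0);
          }
        }
      });
      workers[t].start();
    }
    try {
      for (Thread w : workers) {
        w.join(); // join() also makes the final counts visible to the caller
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new RuntimeException(e);
    }
    return stats;
  }
}
```

With plain {{long}} fields and no lock, the same driver could lose increments.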




[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617444#action_12617444
 ] 

Yonik Seeley commented on SOLR-665:
---

bq. per-instance stats should be replaced to AtomicLong instead of long.

Since we need to synchronize anyway, it's more efficient to just stick things 
like hits++ inside the sync block instead of having a separate AtomicLong.
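
The approach described here can be sketched roughly as follows. This is a simplified illustration with hypothetical names, not the actual LRUCache source: plain long counters are incremented inside the same synchronized block that guards the map, so no separate atomic variable is needed.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: one lock guards both the map and its statistics.
class SynchronizedFifoCache<K, V> {
  private final Map<K, V> map;
  private long lookups; // guarded by synchronized (map)
  private long hits;    // guarded by synchronized (map)

  SynchronizedFifoCache(final int maxSize) {
    // accessOrder=false gives insertion (FIFO) order.
    map = new LinkedHashMap<K, V>(16, 0.75f, false) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxSize;
      }
    };
  }

  V get(K key) {
    synchronized (map) {
      V val = map.get(key);
      lookups++;
      if (val != null) {
        hits++; // already inside the lock, so a plain increment is safe
      }
      return val;
    }
  }

  V put(K key, V value) {
    synchronized (map) {
      return map.put(key, value);
    }
  }

  long lookups() { synchronized (map) { return lookups; } }
  long hits() { synchronized (map) { return hits; } }
  int size() { synchronized (map) { return map.size(); } }
}
```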




[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Fuad Efendi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617445#action_12617445
 ] 

Fuad Efendi commented on SOLR-665:
--

bq. bq. absolutely no need to synchronize get() method for FIFO!

bq. A cache is not a read-only object. Gets need to be synchronized because 
other threads can be changing the cache.

I am familiar with Doug Lea's findings (he wrote his book in 1996, and it 
became part of java.util.concurrent after 10(!!!) years). 

bq. changing the cache
- what is 'cache change'? 
Changing the stored Key, changing the referenced object - never ever happens in 
SOLR. Removal of object - yes. More correctly: removal of key. get(MyKey) will 
return null OR will return MyValue, and "ConcurrentModificationException" will never 
be thrown in SOLR (we are not using Iterator!). We can insert concurrently the 
same (key, value) pairs - not a problem.





[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617447#action_12617447
 ] 

Yonik Seeley commented on SOLR-665:
---

bq. what is 'cache change'?

Adding a new key/value pair to the cache, or removal of a key/value pair from 
the cache.

bq. get(MyKey) will return null OR will return MyValue, and 
"ConcurrentModificationException" will never be thrown in SOLR

This would only be under very specific usage patterns - everything pre-cached, 
and no adds or removes once the cache is "LIVE" (accessed by more than one 
thread concurrently).
ConcurrentModificationException is best effort, not guaranteed.  You can still get 
incorrect results or other exceptions from code that isn't thread safe, even 
when a ConcurrentModificationException isn't thrown.
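
The "best effort" nature of the check can be seen even on a single thread. In the sketch below (hypothetical class and method names, for illustration only) a map is structurally modified while it is being iterated. The modification is detected when more entries remain to be visited, but goes unnoticed when it happens while visiting the last entry, because the iterator only checks on its next advance.

```java
import java.util.ConcurrentModificationException;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: fail-fast detection is not guaranteed.
class FailFastDemo {

  // Adds a new key while visiting the first entry.
  // Returns true if the iterator noticed the modification.
  static boolean detectsModification(int initialEntries) {
    Map<Integer, String> map = new LinkedHashMap<Integer, String>();
    for (int i = 0; i < initialEntries; i++) {
      map.put(i, "v" + i);
    }
    boolean modified = false;
    try {
      for (Integer key : map.keySet()) {
        if (!modified) {
          map.put(-1, "added during iteration"); // structural modification
          modified = true;
        }
      }
      return false; // loop finished without an exception
    } catch (ConcurrentModificationException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println("3 entries, detected: " + detectsModification(3));
    System.out.println("1 entry,   detected: " + detectsModification(1));
  }
}
```

Across threads the situation is weaker still, since without synchronization one thread may not even see another thread's writes consistently.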




[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Sean Timm (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617448#action_12617448
 ] 

Sean Timm commented on SOLR-665:


Fuad--

I agree with Yonik here.  From the HashMap Javadoc:

bq. If multiple threads access this map concurrently, and at least one of the 
threads modifies the map structurally, it must be synchronized externally. (A 
structural modification is any operation that adds or deletes one or more 
mappings; merely changing the value associated with a key that an instance 
already contains is not a structural modification.)




[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Fuad Efendi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617454#action_12617454
 ] 

Fuad Efendi commented on SOLR-665:
--

bq. If multiple threads access this map concurrently, and at least one of the 
threads modifies the map structurally, it must be synchronized externally. (A 
structural modification is any operation that adds or deletes one or more 
mappings; merely changing the value associated with a key that an instance 
already contains is not a structural modification.)

I already commented on it. Just try to avoid 'books' and 'authors', also try to 
find meaning in JavaDocs and try to browse JavaSource instead...

- _structural_  _modification_, related _ConcurrentModificationException_, and 
related _Iterator_: not applicable to SOLR Cache. 

Will never happen. We need to synchronize inserts because each insert may 
calculate size and remove 'eldest' entry, and we need to avoid OOMs. We need to 
synchronize retrieves for LRU because 'Linked' HashMap (with Access Order) will 
change links (will reorder Map). And we do not need to synchronize retrieves 
from Insertion-Ordered LinkedHashMap (FIFO). It is classic... I'd like to 
research java.util.concurrent more.





[jira] Resolved: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Yonik Seeley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yonik Seeley resolved SOLR-665.
---

Resolution: Invalid

Closing, since it's pretty much impossible to avoid all forms of 
synchronization and memory barriers for get() on a shared Map object.
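
One possible middle ground, which nobody in this thread proposed and which is shown here only as a hedged sketch with hypothetical names, is a read-write lock. Readers still pass through a lock, so the memory-visibility guarantees are kept, but they do not exclude each other. This is only valid for an insertion-ordered map, where get() does not relink entries; whether it actually beats a plain synchronized block depends on the workload and was not measured here.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Illustrative only: shared read lock for get(), exclusive write lock for put().
class ReadWriteLockedFifoCache<K, V> {
  private final ReadWriteLock lock = new ReentrantReadWriteLock();
  private final Map<K, V> map;

  ReadWriteLockedFifoCache(final int maxSize) {
    // accessOrder=false: get() is a pure read, so a shared lock is enough for it.
    map = new LinkedHashMap<K, V>(16, 0.75f, false) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxSize;
      }
    };
  }

  V get(K key) {
    lock.readLock().lock();
    try {
      return map.get(key);
    } finally {
      lock.readLock().unlock();
    }
  }

  V put(K key, V value) {
    lock.writeLock().lock(); // inserts and evictions are structural modifications
    try {
      return map.put(key, value);
    } finally {
      lock.writeLock().unlock();
    }
  }

  int size() {
    lock.readLock().lock();
    try {
      return map.size();
    } finally {
      lock.readLock().unlock();
    }
  }
}
```

Per-instance hit counters are left out of this sketch; incrementing them under the read lock would need atomic variables.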




[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Fuad Efendi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617466#action_12617466
 ] 

Fuad Efendi commented on SOLR-665:
--

Why? We have a PoC at least, and we know the bottleneck! 
We need an improvement: avoid _some_ synchronization; it is extremely easy with 
FIFO. We may try to improve LRU too.
Not everything in Java is extremely good: for instance, synchronization. Even 
for a single-threaded application, it costs an additional 600 CPU cycles (I've 
read it somewhere for SUN Java 1.3 on Windows).

Yonik, please allow some time to think / to discuss. 

I'll also try to provide a 'concurrent' LRU; but this issue is FIFO-related.





[jira] Reopened: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Fuad Efendi (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fuad Efendi reopened SOLR-665:
--


The get() method does not need to be synchronized for a FIFO cache. Unsynchronized 
object retrieval is not a structural modification of an _Insertion-Ordered_ 
LinkedHashMap. An unsynchronized cache improves the performance of multithreaded 
applications linearly with the number of CPUs.




[jira] Issue Comment Edited: (SOLR-469) Data Import RequestHandler

2008-07-28 Thread Jonathan Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617472#action_12617472
 ] 

jonjlee edited comment on SOLR-469 at 7/28/08 10:00 AM:
-

When using CachedSqlEntityProcessor, an NPE is thrown 
(EntityProcessorBase.java:367) if a key value doesn't exist in the cached row 
set.   This change to EntityProcessorBase.java should fix that, or let me know 
if I've missed something here!

{noformat}
--- 
contrib/dataimporthandler/src/main/java/org/apache/solr/handler/dataimport/EntityProcessorBase.java
 2008-07-28 12:49:21.0 -0400
+++ 
contrib.new/dataimporthandler/src/main/java/org/apache/solr/handler/dataimport/EntityProcessorBase.java
 2008-07-28 12:40:17.0 -0400
@@ -348,7 +348 @@
-if (rowIdVsRows != null) {
-  rows = rowIdVsRows.get(key);
-  if (rows == null)
-return null;
-  dataSourceRowCache = new ArrayList>(rows);
-  return getFromRowCacheTransformed();
-} else {
+if (rowIdVsRows == null) {
@@ -367,6 +360,0 @@
-dataSourceRowCache = new ArrayList>(rowIdVsRows.get(key));
-if (dataSourceRowCache.isEmpty()) {
-  dataSourceRowCache = null;
-  return null;
-}
-return getFromRowCacheTransformed();
@@ -374,0 +363,5 @@
+rows = rowIdVsRows.get(key);
+if (rows == null)
+  return null;
+dataSourceRowCache = new ArrayList>(rows);
+return getFromRowCacheTransformed();
{noformat}
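
The null check that the patch introduces can be illustrated in isolation. This is a minimal sketch, not the actual EntityProcessorBase code: the class CachedRowLookup, its methods, and the assumed shape of rowIdVsRows (key value mapped to a list of row maps) are hypothetical stand-ins.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the cached-row lookup; not the real EntityProcessorBase.
class CachedRowLookup {
    // Assumed shape: key value -> rows cached for that key.
    private final Map<Object, List<Map<String, Object>>> rowIdVsRows = new HashMap<>();

    // Adds one row to the cache under the given key.
    void cache(Object key, Map<String, Object> row) {
        rowIdVsRows.computeIfAbsent(key, k -> new ArrayList<>()).add(row);
    }

    // Returns a copy of the cached rows, or null when the key was never cached.
    List<Map<String, Object>> lookup(Object key) {
        List<Map<String, Object>> rows = rowIdVsRows.get(key);
        if (rows == null) {
            // Without this check, new ArrayList<>(rows) throws NullPointerException.
            return null;
        }
        return new ArrayList<>(rows);
    }
}
```

Copying the list only after the null check is the same ordering the patch uses: look up, test for null, then build the row cache.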

  was (Author: jonjlee):
When using CachedSqlEntityProcessor, an NPE is thrown 
(EntityProcessorBase.java:367) if a key value doesn't exist in the cached row 
set.   This change to EntityProcessorBase.java should fix that, or let me know 
if I've missed something here!

{quote}
--- 
contrib/dataimporthandler/src/main/java/org/apache/solr/handler/dataimport/EntityProcessorBase.java
 2008-07-28 12:49:21.0 -0400
+++ 
contrib.new/dataimporthandler/src/main/java/org/apache/solr/handler/dataimport/EntityProcessorBase.java
 2008-07-28 12:40:17.0 -0400
@@ -348,7 +348 @@
-if (rowIdVsRows != null) {
-  rows = rowIdVsRows.get(key);
-  if (rows == null)
-return null;
-  dataSourceRowCache = new ArrayList>(rows);
-  return getFromRowCacheTransformed();
-} else {
+if (rowIdVsRows == null) {
@@ -367,6 +360,0 @@
-dataSourceRowCache = new ArrayList>(rowIdVsRows.get(key));
-if (dataSourceRowCache.isEmpty()) {
-  dataSourceRowCache = null;
-  return null;
-}
-return getFromRowCacheTransformed();
@@ -374,0 +363,5 @@
+rows = rowIdVsRows.get(key);
+if (rows == null)
+  return null;
+dataSourceRowCache = new ArrayList>(rows);
+return getFromRowCacheTransformed();
{quote}
  
> Data Import RequestHandler
> --
>
> Key: SOLR-469
> URL: https://issues.apache.org/jira/browse/SOLR-469
> Project: Solr
>  Issue Type: New Feature
>  Components: update
>Affects Versions: 1.3
>Reporter: Noble Paul
>Assignee: Grant Ingersoll
> Fix For: 1.3
>
> Attachments: SOLR-469-contrib.patch, SOLR-469-contrib.patch, 
> SOLR-469-contrib.patch, SOLR-469-contrib.patch, SOLR-469-contrib.patch, 
> SOLR-469-contrib.patch, SOLR-469-contrib.patch, SOLR-469-contrib.patch, 
> SOLR-469-contrib.patch, SOLR-469-contrib.patch, SOLR-469-contrib.patch, 
> SOLR-469.patch, SOLR-469.patch, SOLR-469.patch, SOLR-469.patch, 
> SOLR-469.patch, SOLR-469.patch, SOLR-469.patch, SOLR-469.patch, SOLR-469.patch
>
>
> We need a RequestHandler which can import data from a DB or other dataSources 
> into the Solr index. Think of it as an advanced form of the SqlUpload Plugin 
> (SOLR-103).
> The way it works is as follows.
> * Provide a configuration file (xml) to the Handler which takes in the 
> necessary SQL queries and mappings to a solr schema
>   - It also takes in a properties file for the data source 
> configuration
> * Given the configuration it can also generate the solr schema.xml
> * It is registered as a RequestHandler which can take two commands: 
> do-full-import, do-delta-import
>   - do-full-import - dumps all the data from the Database into the 
> index (based on the SQL query in configuration)
>   - do-delta-import - dumps all the data that has changed since the last 
> import. (We assume a modified-timestamp column in tables)
> * It provides an admin page
>   - where we can schedule it to be run automatically at regular 
> intervals
>   - It shows the status of the Handler (idle, full-import, 
> delta-import)




[jira] Commented: (SOLR-469) Data Import RequestHandler

2008-07-28 Thread Jonathan Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617472#action_12617472
 ] 

Jonathan Lee commented on SOLR-469:
---

When using CachedSqlEntityProcessor, an NPE is thrown 
(EntityProcessorBase.java:367) if a key value doesn't exist in the cached row 
set.   This change to EntityProcessorBase.java should fix that, or let me know 
if I've missed something here!

{quote}
--- 
contrib/dataimporthandler/src/main/java/org/apache/solr/handler/dataimport/EntityProcessorBase.java
 2008-07-28 12:49:21.0 -0400
+++ 
contrib.new/dataimporthandler/src/main/java/org/apache/solr/handler/dataimport/EntityProcessorBase.java
 2008-07-28 12:40:17.0 -0400
@@ -348,7 +348 @@
-if (rowIdVsRows != null) {
-  rows = rowIdVsRows.get(key);
-  if (rows == null)
-return null;
-  dataSourceRowCache = new ArrayList>(rows);
-  return getFromRowCacheTransformed();
-} else {
+if (rowIdVsRows == null) {
@@ -367,6 +360,0 @@
-dataSourceRowCache = new ArrayList>(rowIdVsRows.get(key));
-if (dataSourceRowCache.isEmpty()) {
-  dataSourceRowCache = null;
-  return null;
-}
-return getFromRowCacheTransformed();
@@ -374,0 +363,5 @@
+rows = rowIdVsRows.get(key);
+if (rows == null)
+  return null;
+dataSourceRowCache = new ArrayList>(rows);
+return getFromRowCacheTransformed();
{quote}




[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617479#action_12617479
 ] 

Noble Paul commented on SOLR-665:
-

OK. Let us study the failure cases.
_get()_ does no state modification. 

So if get() is unsynchronized, the worst-case scenario is that we may get an 
Exception (I don't know which: ConcurrentModificationException? 
ArrayIndexOutOfBoundsException?). How often does it happen? What is the 
probability?

What is the big deal? (The system won't crash.) We can catch the Exception and 
return null as if the entry was not found. That means that Solr may have to 
recompute results where it did not have to (this is a cost). 

Here we totally eliminate the cost of synchronization (the benefit). I guess if 
you do a cost-benefit analysis, this does not turn out to be as bad as it is 
projected to be. 

And after all, the user is knowingly using this, fully aware of the cost. Is 
there anything else I have not considered?


> FIFO Cache (Unsynchronized): 9x times performance boost
> ---
>
> Key: SOLR-665
> URL: https://issues.apache.org/jira/browse/SOLR-665
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 1.3
> Environment: JRockit R27 (Java 6)
>Reporter: Fuad Efendi
> Attachments: FIFOCache.java
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Attached is a modified version of LRUCache where 
> 1. map = new LinkedHashMap(initialSize, 0.75f, false) - so that 
> "reordering"/true (the performance bottleneck of LRU) is replaced with 
> "insertion-order"/false (so that it becomes FIFO)
> 2. Almost all (absolutely unnecessary) synchronized statements are commented out
> See the discussion at 
> http://www.nabble.com/LRUCache---synchronized%21--td16439831.html
> Performance metrics (taken from SOLR Admin):
> LRU
> Requests: 7638
> Average Time-Per-Request: 15300
> Average Request-per-Second: 0.06
> FIFO:
> Requests: 3355
> Average Time-Per-Request: 1610
> Average Request-per-Second: 0.11
> Performance increased 9 times, which roughly corresponds to the number of CPUs in 
> the system, http://www.tokenizer.org/ (Shopping Search Engine at Tokenizer.org)
> Current number of documents: 7494689
> name:  filterCache  
> class:org.apache.solr.search.LRUCache  
> version:  1.0  
> description:  LRU Cache(maxSize=1000, initialSize=1000)  
> stats:lookups : 15966954582
> hits : 16391851546
> hitratio : 0.102
> inserts : 4246120
> evictions : 0
> size : 2668705
> cumulative_lookups : 16415839763
> cumulative_hits : 16411608101
> cumulative_hitratio : 0.99
> cumulative_inserts : 4246246
> cumulative_evictions : 0 
> Thanks




[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Fuad Efendi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617481#action_12617481
 ] 

Fuad Efendi commented on SOLR-665:
--

bq. ...and at least one of the threads modifies the map structurally, it 
must be synchronized externally.

- so the thread running _insert_ must be synchronized, and _not_ the thread 
running _retrieve_. Again, we need to synchronize _insert_ only to avoid memory 
leaks, nothing more. SOLR does not iterate over the Map.





[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Fuad Efendi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617482#action_12617482
 ] 

Fuad Efendi commented on SOLR-665:
--

ConcurrentModificationException is thrown only when we iterate over the Map and 
another thread modifies its structure; with LRU each get() modifies the 
structure, with FIFO only inserts do...
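
The difference between the two LinkedHashMap modes can be seen in a small single-threaded sketch. The class and method names below are made up for illustration; the only library behaviour relied on is the documented accessOrder constructor flag, under which the LinkedHashMap javadoc states that merely querying the map with get is a structural modification.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative helper, not part of Solr.
class LinkedHashMapOrderDemo {
    // Inserts a, b, c, reads "a", and returns the resulting key iteration order.
    static List<String> keysAfterGet(boolean accessOrder) {
        Map<String, Integer> map = new LinkedHashMap<>(16, 0.75f, accessOrder);
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        // With accessOrder=true this relinks "a" to the end of the internal list;
        // with accessOrder=false the linked list is left untouched.
        map.get("a");
        return new ArrayList<>(map.keySet());
    }
}
```

keysAfterGet(true) yields [b, c, a], while keysAfterGet(false) yields [a, b, c]. This only shows that an insertion-ordered get() leaves the linked list alone; it says nothing about whether an unsynchronized get() is safe while another thread is inserting.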




[jira] Commented: (SOLR-469) Data Import RequestHandler

2008-07-28 Thread Shalin Shekhar Mangar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617483#action_12617483
 ] 

Shalin Shekhar Mangar commented on SOLR-469:


Nice catch! We shall incorporate the fix into the next patch.

Yes, it can indeed throw a NullPointerException when a key value does not exist 
in the cached row set. However, I am wondering what could cause such a cache miss.




[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Shalin Shekhar Mangar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617490#action_12617490
 ] 

Shalin Shekhar Mangar commented on SOLR-665:


I think Fuad has a point. In an Insertion ordered LinkedHashMap, get makes no 
structural modification and if we synchronize put/remove, we should be fine. 
The cache warming is already thread-safe and we don't have iterators anywhere. 
Am I missing something here?




Re: [jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Mark Miller
Depends on whether you want to go by the javadoc or not - it says that if 
*any* of the threads accessing the map concurrently makes a structural 
change, the map must be synchronized. It's clear that get does not make 
a structural change when using insertion order, but can any other 
threads possibly make a mapping change while get is being called? Sounds like 
yes, and so the contract would seem to indicate you synchronize...


Put can modify shared variables that get accesses - sounds like 
dangerous ground to me. If it works, it's got to be sneaky enough to 
warrant a code smell...


- Mark

Shalin Shekhar Mangar (JIRA) wrote:
[ https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617490#action_12617490 ] 


Shalin Shekhar Mangar commented on SOLR-665:


I think Fuad has a point. In an Insertion ordered LinkedHashMap, get makes no 
structural modification and if we synchronize put/remove, we should be fine. 
The cache warming is already thread-safe and we don't have iterators anywhere. 
Am I missing something here?

  




  




[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617509#action_12617509
 ] 

Mark Miller commented on SOLR-665:
--

Sent to the mailing List:

>>Depends on whether you want to go by the javadoc or not - it says that if *any* of 
>>the threads accessing the map concurrently makes a structural change, the map 
>>must be synchronized. It's clear that get does not make a structural 
>>change when using insertion order, but can any other threads possibly make 
>>a mapping change while get is being called? Sounds like yes, and so the contract 
>>would seem to indicate you synchronize...

>>Put can modify shared variables that get accesses - sounds like dangerous 
>>ground to me. If it works, it's got to be sneaky enough to warrant a code 
>>smell...

>>- Mark

Further:

Here is just one of possibly many problems - a put call can resize and make an 
entirely new table array. A get call can do something like the following with 
the table array:

table[indexFor(hash, table.length)]

Due to execution reordering / memory barriers, it would seem to me that the table 
being indexed into and the table.length may not be the values you were hoping 
for...
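
For comparison, a FIFO cache that follows the javadoc contract to the letter could look roughly like the sketch below. It is not the attached FIFOCache.java; the class name SynchronizedFifoCache and its methods are hypothetical. Eviction relies on the documented LinkedHashMap.removeEldestEntry hook.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical class for illustration; not the attached FIFOCache.java.
class SynchronizedFifoCache<K, V> {
    private final Map<K, V> map;

    SynchronizedFifoCache(final int maxSize) {
        // accessOrder=false keeps insertion order, so the eldest entry is the
        // one that was inserted first, regardless of later reads.
        this.map = new LinkedHashMap<K, V>(16, 0.75f, false) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxSize;
            }
        };
    }

    // Readers take the same lock as writers, so get() cannot overlap a resize.
    synchronized V get(K key) {
        return map.get(key);
    }

    synchronized V put(K key, V value) {
        return map.put(key, value);
    }

    synchronized int size() {
        return map.size();
    }
}
```

This keeps the lock contention the issue complains about, but the critical section for get() is a plain hash lookup with no relinking, which is the part of the FIFO proposal that does not depend on dropping synchronization.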






[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Mike Klaas (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617512#action_12617512
 ] 

Mike Klaas commented on SOLR-665:
-

I haven't looked at the proposed code at all, but it _is_ possible to design 
this kind of datastructure, with much care:

http://www.ddj.com/hpc-high-performance-computing/208801974





[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Shalin Shekhar Mangar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617514#action_12617514
 ] 

Shalin Shekhar Mangar commented on SOLR-665:


Yes, I'm sorry for speaking out of turn here.




[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617515#action_12617515
 ] 

Noble Paul commented on SOLR-665:
-

OK. I take my words back. LinkedHashMap is backed by a HashMap. The table 
can get resized during the lifetime of a map (and hence during a get operation 
as well). So get() may return a wrong value (which is unacceptable) and it may 
never throw an Exception. 

If we ever want to have an LRUCache with a non-synchronized get(), we must have 
it backed by a ConcurrentHashMap.
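
One possible shape of such a cache is sketched below, with the caveat that it is FIFO rather than LRU and is not code from any attached patch. The class name ConcurrentFifoCache is hypothetical, and using a ConcurrentLinkedQueue to remember insertion order is an assumption made for the example.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical sketch: FIFO eviction with a lock-free get(); not an LRU cache.
class ConcurrentFifoCache<K, V> {
    private final ConcurrentHashMap<K, V> map = new ConcurrentHashMap<>();
    private final ConcurrentLinkedQueue<K> insertionOrder = new ConcurrentLinkedQueue<>();
    private final int maxSize;

    ConcurrentFifoCache(int maxSize) {
        this.maxSize = maxSize;
    }

    // No lock needed: ConcurrentHashMap allows reads concurrent with updates.
    V get(K key) {
        return map.get(key);
    }

    // Keys and values must be non-null, as ConcurrentHashMap rejects nulls.
    void put(K key, V value) {
        if (map.put(key, value) == null) {
            insertionOrder.add(key); // key was not present before this call
        }
        // Evict the oldest keys until the map is back under the limit.
        // Under contention the size can briefly overshoot maxSize.
        while (map.size() > maxSize) {
            K eldest = insertionOrder.poll();
            if (eldest == null) {
                break;
            }
            map.remove(eldest);
        }
    }

    int size() {
        return map.size();
    }
}
```

The size bound is only approximate when several threads insert at once, which is usually acceptable for a cache. A true LRU policy would additionally need to record reads, which is where most of the cost discussed in this issue comes from.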

> FIFO Cache (Unsynchronized): 9x times performance boost
> ---
>
> Key: SOLR-665
> URL: https://issues.apache.org/jira/browse/SOLR-665
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 1.3
> Environment: JRockit R27 (Java 6)
>Reporter: Fuad Efendi
> Attachments: FIFOCache.java
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Attached is modified version of LRUCache where 
> 1. map = new LinkedHashMap(initialSize, 0.75f, false) - so that 
> "reordering"/true (performance bottleneck of LRU) is replaced to 
> "insertion-order"/false (so that it became FIFO)
> 2. Almost all (absolutely unneccessary) synchronized statements commented out
> See discussion at 
> http://www.nabble.com/LRUCache---synchronized%21--td16439831.html
> Performance metrics (taken from SOLR Admin):
> LRU
> Requests: 7638
> Average Time-Per-Request: 15300
> Average Request-per-Second: 0.06
> FIFO:
> Requests: 3355
> Average Time-Per-Request: 1610
> Average Request-per-Second: 0.11
> Performance increased 9 times which roughly corresponds to a number of CPU in 
> a system, http://www.tokenizer.org/ (Shopping Search Engine at Tokenizer.org)
> Current number of documents: 7494689
> name:  filterCache  
> class:org.apache.solr.search.LRUCache  
> version:  1.0  
> description:  LRU Cache(maxSize=1000, initialSize=1000)  
> stats:lookups : 15966954582
> hits : 16391851546
> hitratio : 0.102
> inserts : 4246120
> evictions : 0
> size : 2668705
> cumulative_lookups : 16415839763
> cumulative_hits : 16411608101
> cumulative_hitratio : 0.99
> cumulative_inserts : 4246246
> cumulative_evictions : 0 
> Thanks
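
The difference between "reordering"/true and "insertion-order"/false in point 1 
of the description can be seen in a small, single-threaded sketch. The class 
name BoundedMapSketch and the use of removeEldestEntry for eviction are 
assumptions made for illustration; only the three-argument LinkedHashMap 
constructor comes from the description itself.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: a size-bounded LinkedHashMap in either ordering mode. */
class BoundedMapSketch<K, V> extends LinkedHashMap<K, V> {
    private final int maxSize;

    BoundedMapSketch(int maxSize, boolean accessOrder) {
        // accessOrder=true  -> LRU: every get() relinks the entry (a write to the map)
        // accessOrder=false -> FIFO: get() leaves the linked list untouched
        super(16, 0.75f, accessOrder);
        this.maxSize = maxSize;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxSize;
    }

    /** Runs the same workload in the chosen mode and returns the surviving keys. */
    static List<String> keysAfterWorkload(boolean accessOrder) {
        BoundedMapSketch<String, Integer> m = new BoundedMapSketch<>(2, accessOrder);
        m.put("a", 1);
        m.put("b", 2);
        m.get("a");     // only matters in access-order mode
        m.put("c", 3);  // triggers eviction of the eldest entry
        return new ArrayList<>(m.keySet());
    }
}
```

In access-order mode the read of "a" protects it, so "b" is evicted and the 
surviving keys are [a, c]; in insertion-order mode the read has no effect and 
"a" is evicted, leaving [b, c]. Note that this says nothing about thread 
safety: it only shows why get() stops modifying the map once accessOrder is 
false.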

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Fuad Efendi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617516#action_12617516
 ] 

Fuad Efendi commented on SOLR-665:
--

Mark, thanks for going deeper. Yes, _resize_ may change the (Entry[]) _table_; 
the _key_ will disappear from its _bucket_ and _get()_ returns _null_:
{code}
/**
 * The table, resized as necessary. Length MUST Always be a power of two.
 */
transient Entry[] table;
...
public V get(Object key) {
    if (key == null)
        return getForNullKey();
    int hash = hash(key.hashCode());
    for (Entry e = table[indexFor(hash, table.length)];
         e != null;
         e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
            return e.value;
    }
    return null;
}
{code}

A lookup might get _null_ instead of MyValue (during a _resize_ by another 
thread); not a big deal! It is still thread-safe. In such extremely rare cases 
we will simply add a new entry (overwriting the existing one).




[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Fuad Efendi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617516#action_12617516
 ] 

funtick edited comment on SOLR-665 at 7/28/08 12:06 PM:


Mark, thanks for going deeper. Yes, _resize_ may change the (Entry[]) _table_; 
the _key_ will disappear from its _bucket_ and _get()_ returns _null_:
{code}
/**
 * The table, resized as necessary. Length MUST Always be a power of two.
 */
transient Entry[] table;
...
public V get(Object key) {
    if (key == null)
        return getForNullKey();
    int hash = hash(key.hashCode());
    for (Entry e = table[indexFor(hash, table.length)];
         e != null;
         e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
            return e.value;
    }
    return null;
}
{code}

A lookup might get _null_ instead of MyValue (during a _resize_ by another 
thread); not a big deal! It is still thread-safe. In such extremely rare cases 
we will simply add a new entry (overwriting the existing one).




[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Fuad Efendi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617516#action_12617516
 ] 

funtick edited comment on SOLR-665 at 7/28/08 12:12 PM:


Mark, thanks for going deeper. Yes, _resize_ may change the (Entry[]) _table_; 
the _key_ will disappear from its _bucket_ and _get()_ returns _null_:
{code}
/**
 * The table, resized as necessary. Length MUST Always be a power of two.
 */
transient Entry[] table;
...
public V get(Object key) {
    if (key == null)
        return getForNullKey();
    int hash = hash(key.hashCode());
    for (Entry e = table[indexFor(hash, table.length)];
         e != null;
         e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
            return e.value;
    }
    return null;
}
{code}

A lookup might get _null_ instead of MyValue (during a _resize_ by another 
thread); not a big deal! It is still thread-safe. In such extremely rare cases 
we will simply add a new entry (overwriting the existing one).

- get() will _never_ return a wrong value (the Entry object is immutable):
{code}
if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
    return e.value;
{code}






[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Fuad Efendi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617516#action_12617516
 ] 

funtick edited comment on SOLR-665 at 7/28/08 12:14 PM:


Mark, thanks for going deeper. Yes, _resize_ may change the (Entry[]) _table_; 
the _key_ will disappear from its _bucket_ and _get()_ returns _null_:
{code}
/**
 * The table, resized as necessary. Length MUST Always be a power of two.
 */
transient Entry[] table;
...
public V get(Object key) {
    if (key == null)
        return getForNullKey();
    int hash = hash(key.hashCode());
    for (Entry e = table[indexFor(hash, table.length)];
         e != null;
         e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
            return e.value;
    }
    return null;
}
{code}

A lookup might get _null_ instead of MyValue (during a _resize_ by another 
thread); not a big deal! It is still thread-safe. In such extremely rare cases 
we will simply add a new entry (overwriting the existing one).

- get() will _never_ return a wrong value (the Entry object is immutable in our 
case):
{code}
if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
    return e.value;
{code}






[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Shalin Shekhar Mangar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617517#action_12617517
 ] 

Shalin Shekhar Mangar commented on SOLR-665:


Fuad, before going further on this, we must also consider users who use 
alternate JVM implementations. Since the contract is to synchronize, we cannot 
base our decision on Sun JDK's implementation.
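
For readers following along: the contract in question is presumably the class 
documentation of HashMap and LinkedHashMap, which states that the 
implementation is not synchronized and that a map accessed by several threads, 
at least one of which modifies it structurally, must be synchronized 
externally. The sketch below shows the portable way to honour that, using 
Collections.synchronizedMap. The class and method names are hypothetical and 
chosen only for this example.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: programming to the documented contract, not to one JDK. */
class SynchronizedCacheSketch {

    /** Insertion-ordered map wrapped so that every call, get() included, is locked. */
    static Map<String, Integer> newCache() {
        return Collections.synchronizedMap(
                new LinkedHashMap<String, Integer>(16, 0.75f, false));
    }

    /** Several threads put and get distinct keys; returns the final map size. */
    static int fillConcurrently(int threads, final int perThread) {
        final Map<String, Integer> cache = newCache();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    cache.put(id + ":" + i, i);
                    cache.get(id + ":" + i);
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return cache.size();
    }
}
```

Because every operation goes through the wrapper's lock, no insert is lost on 
any conforming JVM. The price is exactly the contention this issue complains 
about, which is why the alternatives discussed in the thread are concurrent 
data structures rather than simply deleting the synchronization.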




[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617518#action_12617518
 ] 

Mark Miller commented on SOLR-665:
--

I don't have time to look at this right now, but I'll certainly spend some time 
on it tonight, Efendi.

A quick return to my point, though: I mentioned something that might be done in 
an implementation and that could be an issue; you came back with a specific 
implementation that may or may not work anyway (I don't have time to check at 
the moment). We can't program to an implementation, though; we must program to 
the contract.




[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617519#action_12617519
 ] 

Yonik Seeley commented on SOLR-665:
---

resize() is only the most obvious problem... there are *tons* of ways this can 
fail, even if you get around the resize (and your "null" fix for resize() won't 
work).  Some of the failures can be in the form of incorrect results rather 
than null pointers or exceptions (so you can't just retry).

I'll reiterate:
bq. it's pretty much impossible to avoid all forms of synchronization and 
memory barriers for get() on a shared Map object.





[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Fuad Efendi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617520#action_12617520
 ] 

Fuad Efendi commented on SOLR-665:
--

Shalin, we are already using _AtomicLong of Java 5_; the JVM is a different 
story... JRockit R27 is a JVM from BEA-Oracle, and it is JDK 6 powered (rt.jar 
comes from SUN).
I just tried to compare synchronized with unsynchronized and found it _is_ the 
main problem for faceting...
Another problem: somehow faceting recalculates each time (using filterCache 
during repeated _recalculations_); queryCache is not enough...




[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Fuad Efendi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617523#action_12617523
 ] 

Fuad Efendi commented on SOLR-665:
--

We need to invite Doug Lea to this discussion...
http://en.wikipedia.org/wiki/Doug_Lea
http://gee.cs.oswego.edu/dl/index.html

We may simply use java.util.concurrent.locks instead of heavy synchronized... 
we may also use the Executor framework instead of single-threaded faceting... We 
may even base SOLR on the Apache MINA project.
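
As an illustration of the first idea above, here is a minimal sketch of a FIFO 
cache guarded by a ReentrantReadWriteLock from java.util.concurrent.locks. It 
is a sketch under assumptions, not a proposed patch: the class name is 
hypothetical, eviction via removeEldestEntry is assumed, and it relies on the 
map being insertion-ordered so that get() does not modify it and can therefore 
run under the shared read lock.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical sketch: readers share a lock, writers take it exclusively. */
class ReadWriteLockFifoCacheSketch<K, V> {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final Map<K, V> map;

    ReadWriteLockFifoCacheSketch(final int maxSize) {
        // accessOrder=false: get() never relinks entries, so it is a pure read.
        // With accessOrder=true this design would be wrong, because get() would
        // mutate the map while holding only the read lock.
        this.map = new LinkedHashMap<K, V>(16, 0.75f, false) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxSize;
            }
        };
    }

    V get(K key) {
        lock.readLock().lock();   // many readers may hold this at once
        try {
            return map.get(key);
        } finally {
            lock.readLock().unlock();
        }
    }

    V put(K key, V value) {
        lock.writeLock().lock();  // excludes readers and other writers
        try {
            return map.put(key, value);
        } finally {
            lock.writeLock().unlock();
        }
    }

    int size() {
        lock.readLock().lock();
        try {
            return map.size();
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

Concurrent get() calls no longer block each other, and the lock still provides 
the memory barriers that an entirely unsynchronized get() lacks. Whether this 
is faster than plain synchronized under a real faceting load would have to be 
measured.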




[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Shalin Shekhar Mangar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617526#action_12617526
 ] 

Shalin Shekhar Mangar commented on SOLR-665:


bq. I just tried to compare synchronized with unsynchronized and found it is 
the main problem for faceting...
Fuad, I am completely with you on this. Everybody will agree that a more 
efficient implementation will be a very useful addition to Solr. However, 
Yonik's concerns on the current patch are valid and we cannot go ahead with the 
current one.




[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Fuad Efendi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617529#action_12617529
 ] 

Fuad Efendi commented on SOLR-665:
--

The concerns are probably due to a misunderstanding of some _contract_... 

{code}
/**
 * The table, resized as necessary. Length MUST Always be a power of two.
 */
transient Entry[] table;

void resize(int newCapacity) {
    Entry[] oldTable = table;
    int oldCapacity = oldTable.length;
    if (oldCapacity == MAXIMUM_CAPACITY) {
        threshold = Integer.MAX_VALUE;
        return;
    }

    Entry[] newTable = new Entry[newCapacity];
    transfer(newTable);
    table = newTable;
    threshold = (int)(newCapacity * loadFactor);
}


public V get(Object key) {
    if (key == null)
        return getForNullKey();
    int hash = hash(key.hashCode());
    for (Entry e = table[indexFor(hash, table.length)];
         e != null;
         e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
            return e.value;
    }
    return null;
}

{code}


- In the worst case we will hold a pointer to the _old_ table, and even with a 
_new_ one of smaller size we won't get _any_ ArrayIndexOutOfBoundsException.
There is no _contract_ requiring synchronization on get() of HashMap or 
LinkedHashMap; it IS application specific.





[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Fuad Efendi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617529#action_12617529
 ] 

funtick edited comment on SOLR-665 at 7/28/08 12:51 PM:


The concerns are probably due to a misunderstanding of some _contract_... 

{code}
/**
 * The table, resized as necessary. Length MUST Always be a power of two.
 */
transient Entry[] table;

void resize(int newCapacity) {
    Entry[] oldTable = table;
    int oldCapacity = oldTable.length;
    if (oldCapacity == MAXIMUM_CAPACITY) {
        threshold = Integer.MAX_VALUE;
        return;
    }

    Entry[] newTable = new Entry[newCapacity];
    transfer(newTable);
    table = newTable;
    threshold = (int)(newCapacity * loadFactor);
}


public V get(Object key) {
    if (key == null)
        return getForNullKey();
    int hash = hash(key.hashCode());
    for (Entry e = table[indexFor(hash, table.length)];
         e != null;
         e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
            return e.value;
    }
    return null;
}

{code}


- In the worst case we will hold a pointer to the _old_ table, and even with a 
_new_ one of smaller size we won't get _any_ ArrayIndexOutOfBoundsException.
There is no _contract_ requiring synchronization on get() of HashMap or 
LinkedHashMap; it IS application specific.
- We will never have _wrong_ results because Entry is immutable.
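
The "power of two" requirement in the quoted source matters because of how the bucket index is derived. A minimal sketch, assuming the indexFor helper is the bit mask h & (length - 1) as in the Sun JDK 6 HashMap source (the class name IndexForDemo is made up for illustration):

```java
// Sketch of the bucket-index computation used by the quoted get().
// IndexForDemo is a hypothetical name; indexFor is assumed to match the JDK 6 helper.
final class IndexForDemo {
    /** Maps a hash to a bucket; only valid when length is a power of two. */
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        int hash = 0x7A5C3F19;
        // Doubling the table keeps the low bits and exposes one more bit of the hash.
        for (int length = 16; length <= 64; length <<= 1) {
            System.out.println("length=" + length + " index=" + indexFor(hash, length));
        }
        // Negative hashes are fine too: the mask clears the sign bit.
        System.out.println("index for -1 in 16 buckets: " + indexFor(-1, 16));
    }
}
```

The result is always in the range [0, length) for the length that was passed in. That bound is a statement about one table; it does not by itself say anything about code that reads the table reference and its length at two different moments.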



[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Fuad Efendi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617529#action_12617529
 ] 

funtick edited comment on SOLR-665 at 7/28/08 12:54 PM:


The concerns are probably due to a misunderstanding of some _contract_... 

{code}
/**
 * The table, resized as necessary. Length MUST Always be a power of two.
 */
transient Entry[] table;

void resize(int newCapacity) {
    Entry[] oldTable = table;
    int oldCapacity = oldTable.length;
    if (oldCapacity == MAXIMUM_CAPACITY) {
        threshold = Integer.MAX_VALUE;
        return;
    }

    Entry[] newTable = new Entry[newCapacity];
    transfer(newTable);
    table = newTable;
    threshold = (int)(newCapacity * loadFactor);
}


public V get(Object key) {
    if (key == null)
        return getForNullKey();
    int hash = hash(key.hashCode());
    for (Entry e = table[indexFor(hash, table.length)];
         e != null;
         e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
            return e.value;
    }
    return null;
}

{code}


- In the worst case we will hold a pointer to the _old_ table, and even with a 
_new_ one of smaller size we won't get _any_ ArrayIndexOutOfBoundsException.
- There is no _contract_ requiring synchronization on get() of HashMap or 
LinkedHashMap; it IS application specific.
- We will never have _wrong_ results because Entry is immutable.



[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Fuad Efendi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617529#action_12617529
 ] 

funtick edited comment on SOLR-665 at 7/28/08 1:04 PM:
---

The concerns are probably due to a misunderstanding of some _contract_... 

{code}
/**
 * The table, resized as necessary. Length MUST Always be a power of two.
 */
transient Entry[] table;

void resize(int newCapacity) {
    Entry[] oldTable = table;
    int oldCapacity = oldTable.length;
    if (oldCapacity == MAXIMUM_CAPACITY) {
        threshold = Integer.MAX_VALUE;
        return;
    }

    Entry[] newTable = new Entry[newCapacity];
    transfer(newTable);
    table = newTable;
    threshold = (int)(newCapacity * loadFactor);
}


public V get(Object key) {
    if (key == null)
        return getForNullKey();
    int hash = hash(key.hashCode());
    for (Entry e = table[indexFor(hash, table.length)];
         e != null;
         e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
            return e.value;
    }
    return null;
}

{code}


- In the worst case we will hold a pointer to the _old_ table, and even with a 
_new_ one of smaller size we won't get _any_ ArrayIndexOutOfBoundsException.
- There is no _contract_ requiring synchronization on get() of HashMap or 
LinkedHashMap; it IS application specific.
- We will never have _wrong_ results because Entry is immutable.

{code}
/** 
 * Transfer all entries from current table to newTable.
 */
void transfer(Entry[] newTable) {
    Entry[] src = table;
    int newCapacity = newTable.length;
    for (int j = 0; j < src.length; j++) {
        Entry e = src[j];
        if (e != null) {
            src[j] = null;
            do {
                Entry next = e.next;
                int i = indexFor(e.hash, newCapacity);
                e.next = newTable[i];
                newTable[i] = e;
                e = next;
            } while (e != null);
        }
    }
}
{code}

- We won't even get a NullPointerException after src[j] = null.
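
To see what the quoted transfer() loop does to a single bucket chain, here is a single-threaded sketch. Node and TransferDemo are hypothetical stand-ins for HashMap's internal classes, and indexFor is assumed to be the usual h & (length - 1) mask; nothing here exercises concurrent access.

```java
// Single-threaded sketch of the quoted transfer() loop.
// Node and TransferDemo are hypothetical stand-ins for HashMap internals.
final class TransferDemo {
    static final class Node {
        final int hash;
        final String key;
        Node next; // rewritten while the chain is moved
        Node(int hash, String key, Node next) {
            this.hash = hash;
            this.key = key;
            this.next = next;
        }
    }

    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    /** Moves every chain from src into dst, clearing each src bucket first. */
    static void transfer(Node[] src, Node[] dst) {
        for (int j = 0; j < src.length; j++) {
            Node e = src[j];
            if (e != null) {
                src[j] = null;
                do {
                    Node next = e.next;
                    int i = indexFor(e.hash, dst.length);
                    e.next = dst[i]; // head insertion: the same object is relinked
                    dst[i] = e;
                    e = next;
                } while (e != null);
            }
        }
    }

    /** Renders one bucket's chain as "a>b>c" (empty string for an empty bucket). */
    static String chain(Node head) {
        StringBuilder sb = new StringBuilder();
        for (Node e = head; e != null; e = e.next) {
            if (sb.length() > 0) sb.append('>');
            sb.append(e.key);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Node[] old = new Node[2];
        // Three entries whose hashes all land in bucket 1 of a 2-slot table.
        old[1] = new Node(1, "a", new Node(3, "b", new Node(5, "c", null)));
        Node[] bigger = new Node[4];
        transfer(old, bigger);
        System.out.println("bucket 1: " + chain(bigger[1]));
        System.out.println("bucket 3: " + chain(bigger[3]));
        System.out.println("old bucket 1 cleared: " + (old[1] == null));
    }
}
```

In this run the chain a>b>c is split into c>a (bucket 1) and b (bucket 3): the entries are reused rather than copied, their next pointers are rewritten, and the relative order within a bucket is reversed. The old bucket is null for the whole duration of the move.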

P.S.
Of course, I agree - these are Java internals, not part of the public Map 
interface _contract_ - so should we avoid using the implementation then, and base 
the decision on a specific implementation from SUN? I believe it is specified 
somewhere in a JSR too...
{code}
 * @author  Doug Lea
 * @author  Josh Bloch
 * @author  Arthur van Hoff
 * @author  Neal Gafter
 * @version 1.65, 03/03/05
{code}


[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Fuad Efendi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617529#action_12617529
 ] 

funtick edited comment on SOLR-665 at 7/28/08 1:05 PM:
---

The concerns are probably due to a misunderstanding of some _contract_... 

{code}
/**
 * The table, resized as necessary. Length MUST Always be a power of two.
 */
transient Entry[] table;

void resize(int newCapacity) {
    Entry[] oldTable = table;
    int oldCapacity = oldTable.length;
    if (oldCapacity == MAXIMUM_CAPACITY) {
        threshold = Integer.MAX_VALUE;
        return;
    }

    Entry[] newTable = new Entry[newCapacity];
    transfer(newTable);
    table = newTable;
    threshold = (int)(newCapacity * loadFactor);
}


public V get(Object key) {
    if (key == null)
        return getForNullKey();
    int hash = hash(key.hashCode());
    for (Entry e = table[indexFor(hash, table.length)];
         e != null;
         e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
            return e.value;
    }
    return null;
}

{code}


- In the worst case we will hold a pointer to the _old_ table, and even with a 
_new_ one of smaller size we won't get _any_ ArrayIndexOutOfBoundsException.
- There is no _contract_ requiring synchronization on get() of HashMap or 
LinkedHashMap; it IS application specific.
- We will never have _wrong_ results because Entry is immutable.

{code}
/** 
 * Transfer all entries from current table to newTable.
 */
void transfer(Entry[] newTable) {
    Entry[] src = table;
    int newCapacity = newTable.length;
    for (int j = 0; j < src.length; j++) {
        Entry e = src[j];
        if (e != null) {
            src[j] = null;
            do {
                Entry next = e.next;
                int i = indexFor(e.hash, newCapacity);
                e.next = newTable[i];
                newTable[i] = e;
                e = next;
            } while (e != null);
        }
    }
}
{code}

- We won't even get a NullPointerException after src[j] = null.

P.S.
Of course, I agree - these are Java internals, not part of the public Map 
interface _contract_ - so should we avoid using the implementation then? I believe 
it is specified somewhere in a JSR too...
{code}
 * @author  Doug Lea
 * @author  Josh Bloch
 * @author  Arthur van Hoff
 * @author  Neal Gafter
 * @version 1.65, 03/03/05
{code}


[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Mike Klaas (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617549#action_12617549
 ] 

Mike Klaas commented on SOLR-665:
-

[quote]We may simply use java.util.concurrent.locks instead of heavy 
synchronized... we may also use Executor framework instead of single-thread 
faceting... We may even base SOLR on Apache MINA project.[/quote]

Simply replacing synchronized with java.util.concurrent.locks doesn't increase 
performance.  There needs to be a specific strategy for employing these locks 
in a way that makes sense.

For instance, one idea would be to create a read/write lock with the put()'s 
covered by write and get()'s covered by read.  This would allow multiple 
parallel reads and will be thread-safe.  Another is to create something like 
ConcurrentLinkedHashMap.
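
The read/write-lock idea can be sketched roughly as follows. This is an illustration only, not the attached FIFOCache.java and not Solr's LRUCache; the class name RwLockFifoCache and its methods are hypothetical. It assumes an insertion-ordered LinkedHashMap (accessOrder=false), where get() does not relink entries and can therefore share a read lock.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch: a bounded FIFO cache guarded by a read/write lock.
final class RwLockFifoCache<K, V> {
    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    private final Map<K, V> map;

    RwLockFifoCache(final int maxSize) {
        // accessOrder=false keeps insertion order, so lookups never modify the map.
        this.map = new LinkedHashMap<K, V>(16, 0.75f, false) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxSize; // evict the oldest insertion when over capacity
            }
        };
    }

    /** Lookups take the shared lock, so many readers can proceed in parallel. */
    V get(K key) {
        lock.readLock().lock();
        try {
            return map.get(key);
        } finally {
            lock.readLock().unlock();
        }
    }

    /** Inserts take the exclusive lock because they may resize or evict. */
    V put(K key, V value) {
        lock.writeLock().lock();
        try {
            return map.put(key, value);
        } finally {
            lock.writeLock().unlock();
        }
    }

    int size() {
        lock.readLock().lock();
        try {
            return map.size();
        } finally {
            lock.readLock().unlock();
        }
    }

    public static void main(String[] args) {
        RwLockFifoCache<String, Integer> cache = new RwLockFifoCache<String, Integer>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.put("c", 3); // pushes out "a", the oldest insertion
        System.out.println(cache.get("a") + " " + cache.get("c") + " " + cache.size());
    }
}
```

With an access-ordered map (accessOrder=true, i.e. LRU), get() itself reorders the linked list, so it would need the write lock and the benefit of parallel reads disappears. Whether this approach actually beats plain synchronized would still have to be measured.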

These strategies should be tested before trying to create a lock-free get() 
version, which, if even possible, would rely deeply on the implementation (such 
a structure would have to be created from scratch, I believe).  I'd expect 
anyone who is able to create such a thing to be familiar enough with memory 
barriers and such issues to be able to deeply explain the problems with 
double-checked locking off the top of their head (and immediately see such 
problems in other code).



[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Fuad Efendi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617562#action_12617562
 ] 

Fuad Efendi commented on SOLR-665:
--

{quote}Simply replacing synchronized with java.util.concurrent.locks doesn't 
increase performance. There needs to be a specific strategy for employing these 
locks in a way that makes sense.{quote}

I absolutely agree... 

ConcurrentHashMap is based on some level of acceptable safety, for specific 
tasks only:
bq. They do not throw ConcurrentModificationException. However, iterators are 
designed to be used by only one thread at a time. 

We can try to design specific caches directly implementing the Map or 
ConcurrentMap interfaces. We should define 'safety' levels (for instance, _null_ 
is not a problem if the cache already contains this object, added by another 
thread concurrently; cache elements are explicitly immutable objects; etc.). 
FIFO looks simplest and it _does_ indeed work; for LRU we need reordering on 
each get(), _OR_ we can make it weaker using weak (approximate) reordering 
somehow...
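
The FIFO versus LRU difference comes down to the third constructor argument of LinkedHashMap. A small single-threaded sketch, with EvictionOrderDemo and its helper methods made up for illustration:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical demo: the same workload against a FIFO and an LRU bounded map.
final class EvictionOrderDemo {
    /** Builds a map that evicts its eldest entry once it grows past maxSize. */
    static <K, V> Map<K, V> bounded(final int maxSize, boolean accessOrder) {
        return new LinkedHashMap<K, V>(16, 0.75f, accessOrder) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxSize;
            }
        };
    }

    /** Inserts a and b, reads a, inserts c, and reports which keys survive. */
    static String keysAfterWorkload(boolean accessOrder) {
        Map<String, Integer> m = bounded(2, accessOrder);
        m.put("a", 1);
        m.put("b", 2);
        m.get("a");    // reorders only when accessOrder is true
        m.put("c", 3); // triggers one eviction
        return m.keySet().toString();
    }

    public static void main(String[] args) {
        System.out.println("FIFO (accessOrder=false): " + keysAfterWorkload(false)); // [b, c]
        System.out.println("LRU  (accessOrder=true):  " + keysAfterWorkload(true));  // [a, c]
    }
}
```

In the FIFO case the read of "a" changes nothing and "a" is evicted as the oldest insertion. In the LRU case the same read moves "a" to the tail of the list, which is exactly the per-get() reordering discussed above.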



[jira] Commented: (SOLR-614) Allow components to read any kind of XML from solrconfig

2008-07-28 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617563#action_12617563
 ] 

Hoss Man commented on SOLR-614:
---

bq. All the changes are under the skin. There will be no changes to the 
configuration or public API. All the components must stick to the old 
configuration. So, I hope there is no need to have any confusion

I'm only half certain I understand exactly what's being discussed, so take this 
comment with a grain of salt...

Even if the code changes dictate that existing components must still use the 
existing config syntax (I assume because it asks for things like 
params.get("foo") instead of params.get("@foo")), the fact remains that if a 
single type of plugin (ie: RequestHandler) *can* support multiple config 
syntaxes, the potential exists for people to get very confused ... if we make 
this change, then someone who is already familiar with the way handlers are 
configured won't understand the "/replication" example Noble posted -- likewise 
someone new to Solr who sees that "/replication" example config will have a 
harder time understanding the 'old' config style for other request handlers.

There is a *lot* of value in maintaining consistency -- even if it's ugly.

Looking ahead two or three moves: adding support for something like this now 
would also probably make it that much harder to write a "converter" for 
existing solr config files if/when we switch to Spring or some other Java 
object wiring/configuration system.  It's a minor problem, but it has occurred 
to me.



> Allow components to read any kind of XML from solrconfig
> 
>
> Key: SOLR-614
> URL: https://issues.apache.org/jira/browse/SOLR-614
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 1.3
>Reporter: Noble Paul
>Assignee: Shalin Shekhar Mangar
>Priority: Trivial
> Fix For: 1.3
>
> Attachments: SOLR-614.patch, SOLR-614.patch, SOLR-614.patch, 
> SOLR-614.patch
>
>
> All the components initialized by Solr have an init(NamedList args) 
> initializer. This leads us to writing the configuration needed for the 
> component in the NamedList xml format. People familiar with Solr may know the 
> format but most of what is written is more noise than information. Users who 
> are not familiar with the format find it too difficult to understand why they 
> have to write it this way. Moreover, it is not a very efficient way to 
> configure.



[jira] Updated: (SOLR-474) audit docs for Spellchecker

2008-07-28 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-474:
--

Description: 
according to this troubling comment from Mike, the spellchecker handler 
javadocs (and wiki) may not reflect reality...

http://www.nabble.com/spellcheckhandler-to14627712.html#a14627712

{quote}
Multi-word spell checking is available only with extendedResults=true, and only 
in trunk.  I
believe that the current javadocs are incorrect on this point.
{quote}

we should audit/fix this before 1.3


  was:
according to this troubling comment from Mike, the highlighter javadocs (and 
wiki) may not reflect reality...

http://www.nabble.com/spellcheckhandler-to14627712.html#a14627712

{quote}
Multi-word spell checking is available only with extendedResults=true, and only 
in trunk.  I
believe that the current javadocs are incorrect on this point.
{quote}

we should audit/fix this before 1.3



> audit docs for Spellchecker
> ---
>
> Key: SOLR-474
> URL: https://issues.apache.org/jira/browse/SOLR-474
> Project: Solr
>  Issue Type: Task
>Affects Versions: 1.3
>Reporter: Hoss Man
>Assignee: Mike Klaas
> Fix For: 1.3
>
>
> according to this troubling comment from Mike, the spellchecker handler 
> javadocs (and wiki) may not reflect reality...
> http://www.nabble.com/spellcheckhandler-to14627712.html#a14627712
> {quote}
> Multi-word spell checking is available only with extendedResults=true, and 
> only in trunk.  I
> believe that the current javadocs are incorrect on this point.
> {quote}
> we should audit/fix this before 1.3



[jira] Commented: (SOLR-474) audit docs for Spellchecker

2008-07-28 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617578#action_12617578
 ] 

Hoss Man commented on SOLR-474:
---

if there's code included in a release, the documentation for that code needs to 
be accurate -- or at the very least, removed if it's currently inaccurate.




[jira] Commented: (SOLR-474) audit docs for Spellchecker

2008-07-28 Thread Mike Klaas (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617580#action_12617580
 ] 

Mike Klaas commented on SOLR-474:
-

I will look at this before release.



[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Fuad Efendi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617529#action_12617529
 ] 

funtick edited comment on SOLR-665 at 7/28/08 3:12 PM:
---

The concerns are probably due to a misunderstanding of some _contract_... 

{code}
/**
 * The table, resized as necessary. Length MUST Always be a power of two.
 */
transient Entry[] table;

void resize(int newCapacity) {
    Entry[] oldTable = table;
    int oldCapacity = oldTable.length;
    if (oldCapacity == MAXIMUM_CAPACITY) {
        threshold = Integer.MAX_VALUE;
        return;
    }

    Entry[] newTable = new Entry[newCapacity];
    transfer(newTable);
    table = newTable;
    threshold = (int)(newCapacity * loadFactor);
}


public V get(Object key) {
    if (key == null)
        return getForNullKey();
    int hash = hash(key.hashCode());
    for (Entry e = table[indexFor(hash, table.length)];
         e != null;
         e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
            return e.value;
    }
    return null;
}

{code}


- In the worst case we will hold a pointer to the _old_ table, and even with a 
_new_ one of smaller size we won't get _any_ ArrayIndexOutOfBoundsException.
- There is no _contract_ requiring synchronization on get() of HashMap or 
LinkedHashMap; it IS application specific.
- We will never have _wrong_ results because Entry is immutable.

{code}
/**
 * Transfer all entries from current table to newTable.
 */
void transfer(Entry[] newTable) {
    Entry[] src = table;
    int newCapacity = newTable.length;
    for (int j = 0; j < src.length; j++) {
        Entry<K,V> e = src[j];
        if (e != null) {
            src[j] = null;
            do {
                Entry<K,V> next = e.next;
                int i = indexFor(e.hash, newCapacity);
                e.next = newTable[i];
                newTable[i] = e;
                e = next;
            } while (e != null);
        }
    }
}
{code}

- We won't even get a NullPointerException after src[j] = null.

P.S.
Of course, I agree - these are Java internals, not the public Map interface 
_contract_ - but should we avoid using the implementation then? I believe it 
is specified somewhere in a JSR too...
{code}
 * @author  Doug Lea
 * @author  Josh Bloch
 * @author  Arthur van Hoff
 * @author  Neal Gafter
 * @version 1.65, 03/03/05
{code}

P.P.S.
Do not forget to look at the top of this discussion:
{code}
description: xxx Cache(maxSize=1000, initialSize=1000) 
size : 2668705
cumulative_inserts : 4246246
{code}

- _cumulative_inserts_ is almost double the _size_, which shows that 
double-inserts are real.
- I checked catalina_out: no NullPointerException, 
ArrayIndexOutOfBoundsException, etc.
- I don't think we should worry _too much_ about a possible change of the Map 
implementation by SUN :P... in that case we should use neither java.lang.String 
nor java.util.Date (some are placed in the wrong packages).


[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Fuad Efendi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617529#action_12617529
 ] 

funtick edited comment on SOLR-665 at 7/28/08 3:46 PM:
---

The concerns are probably due to a misunderstanding of some _contract_...

{code}
/**
 * The table, resized as necessary. Length MUST Always be a power of two.
 */
transient Entry[] table;

void resize(int newCapacity) {
    Entry[] oldTable = table;
    int oldCapacity = oldTable.length;
    if (oldCapacity == MAXIMUM_CAPACITY) {
        threshold = Integer.MAX_VALUE;
        return;
    }

    Entry[] newTable = new Entry[newCapacity];
    transfer(newTable);
    table = newTable;
    threshold = (int)(newCapacity * loadFactor);
}

public V get(Object key) {
    if (key == null)
        return getForNullKey();
    int hash = hash(key.hashCode());
    for (Entry<K,V> e = table[indexFor(hash, table.length)];
         e != null;
         e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
            return e.value;
    }
    return null;
}
{code}


- In the worst case we will have a pointer to the _old_ table, and even with a 
_new_ one of smaller size we won't get _any_ ArrayIndexOutOfBoundsException.
- There is no _contract_ requiring synchronization on get() of HashMap or 
LinkedHashMap; it IS application specific.
- We will never get _wrong_ results because Entry is immutable.

{code}
/**
 * Transfer all entries from current table to newTable.
 */
void transfer(Entry[] newTable) {
    Entry[] src = table;
    int newCapacity = newTable.length;
    for (int j = 0; j < src.length; j++) {
        Entry<K,V> e = src[j];
        if (e != null) {
            src[j] = null;
            do {
                Entry<K,V> next = e.next;
                int i = indexFor(e.hash, newCapacity);
                e.next = newTable[i];
                newTable[i] = e;
                e = next;
            } while (e != null);
        }
    }
}
{code}

- We won't even get a NullPointerException after src[j] = null.

P.S.
Of course, I agree - these are Java internals, not the public Map interface 
_contract_ - but should we avoid using the implementation then? I believe it 
is specified somewhere in a JSR too...
{code}
 * @author  Doug Lea
 * @author  Josh Bloch
 * @author  Arthur van Hoff
 * @author  Neal Gafter
 * @version 1.65, 03/03/05
{code}

P.P.S.
Do not forget to look at the top of this discussion:
{code}
description: xxx Cache(maxSize=1000, initialSize=1000) 
size : 2668705
cumulative_inserts : 4246246
{code}

- _cumulative_inserts_ is almost double the _size_, which shows that 
double-inserts are real.
- I checked catalina_out: no NullPointerException, 
ArrayIndexOutOfBoundsException, etc.
- I don't think we should worry _too much_ about a possible change of the Map 
implementation by SUN :P... in that case we should use neither java.lang.String 
nor java.util.Date (some are placed in the wrong packages).
- About thread safety: some participants wrongly assume that making 
object access totally synchronized will make their code thread-safe... deadlock.


Re: Pulling statistics from solr for different time slices

2008-07-28 Thread Chris Hostetter

: Before I create an issue, is it even possible / desirable to add handlerStart
: and totaltime to the stats that the RequestHandlerBase reports?

are you talking about the aggregate stats reported by 
RequestHandlerBase.getStatistics() ?

I see no reason not to ... 
Committed revision 680553.



-Hoss



Re: Pulling statistics from solr for different time slices

2008-07-28 Thread Mark Miller

Chris Hostetter wrote:

: Before I create an issue, is it even possible / desirable to add handlerStart
: and totaltime to the stats that the RequestHandlerBase reports?

are you talking about the aggregate stats reported by 
RequestHandlerBase.getStatistics() ?


I see no reason not to ... 
Committed revision 680553.




-Hoss

  
I am, thanks Hoss. The issue was that I don't want the full average, I 
want the average over a given sample period, and including that info 
lets you get it.
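
The arithmetic behind that can be sketched as follows. This is only an illustration: the class and field names (StatsSlice, Snapshot, requests, totalTimeMs) are made up here, and it assumes the handler reports a cumulative request count alongside the cumulative total time.

```java
final class StatsSlice {
    // One reading of a handler's cumulative counters (field names are illustrative).
    static final class Snapshot {
        final long requests;     // cumulative request count
        final long totalTimeMs;  // cumulative time spent handling requests, in ms

        Snapshot(long requests, long totalTimeMs) {
            this.requests = requests;
            this.totalTimeMs = totalTimeMs;
        }
    }

    // Average time per request between two snapshots; NaN if no requests arrived.
    static double avgTimePerRequest(Snapshot earlier, Snapshot later) {
        long deltaRequests = later.requests - earlier.requests;
        if (deltaRequests <= 0) {
            return Double.NaN;
        }
        return (double) (later.totalTimeMs - earlier.totalTimeMs) / deltaRequests;
    }
}
```

Polling the stats at the start and end of a time slice and differencing the two readings gives the average for that slice only, independent of everything that happened before it.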


[jira] Commented: (SOLR-256) Stats via JMX

2008-07-28 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617632#action_12617632
 ] 

Hoss Man commented on SOLR-256:
---

Shalin:

I'm really sorry, I'm way behind schedule on my "patch review" responsibilities.

Skimming the patch, I see no red flags.

Some minor misc nitpicks... 

* I'd prefer if we removed the hard coded port number in TestJmxMonitoredMap 
(hard coded stuff like that has caused us nothing but problems in the past).  
If I'm understanding the JMXServiceURL javadocs correctly, let's hardcode 
port=0 for the url constructed by the test ... then add a getJMXServiceURL() to 
JmxMonitoredMap that the test can then call and pass to the 
JMXConnectorFactory ... that should give us a random port that isn't in 
use by anyone else, right?
* it seems like it would be a little cleaner if SolrIndexSearcher.register() 
registered itself before it registered its subcomponents .. not sure that it 
really matters, but it certainly reads a little weird.
* SolrIndexSearcher.register() is logging '"Registering new searcher: " + 
System.currentTimeMillis()' for every cache it registers ... that seems like a 
cut/paste mistake.
* is there any reason not to have the searchers add/remove themselves using 
their true name on register()/close() *and* have register() call  
put("searcher", this) like you have it now? ... that way you'd get the benefits 
you mentioned before (continuous monitoring of the current searcher) but you 
could also get information about how many "live" searchers there currently are, 
and what their stats look like (so you could, for example, notice when there is 
a really old Searcher hanging around for some inexplicable reason, probably a 
bug.)
* couldn't we completely eliminate any overhead of JMX for people who haven't 
enabled it by adding an "isEnabled()" method to JmxMonitoredMap that returns 
true if server!=null and then make the SolrCore changes look like...
{code}
 //Initialize Registry w/JMX if enabled
 JmxMonitoredMap tmp = new JmxMonitoredMap(name, config.jmxConfig);
 infoRegistry = (tmp.isEnabled() ? tmp : new HashMap() );
{code}
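
A self-contained sketch of that last suggestion, for illustration only. MonitoredMap here is a hypothetical stand-in and not the real JmxMonitoredMap from the patch; the boolean flag plays the role of config.jmxConfig.

```java
import java.lang.management.ManagementFactory;
import java.util.HashMap;
import java.util.Map;
import javax.management.MBeanServer;

final class RegistryChoice {
    // Hypothetical stand-in for JmxMonitoredMap: a map that may hold an MBeanServer.
    static final class MonitoredMap<K, V> extends HashMap<K, V> {
        private final MBeanServer server; // null when JMX is not configured

        MonitoredMap(boolean jmxConfigured) {
            this.server = jmxConfigured ? ManagementFactory.getPlatformMBeanServer() : null;
        }

        boolean isEnabled() {
            return server != null;
        }
    }

    // Use the monitored map only when JMX is on; otherwise fall back to a plain HashMap.
    static <K, V> Map<K, V> newRegistry(boolean jmxConfigured) {
        MonitoredMap<K, V> tmp = new MonitoredMap<K, V>(jmxConfigured);
        if (tmp.isEnabled()) {
            return tmp;
        }
        return new HashMap<K, V>();
    }
}
```

With JMX off, callers get an ordinary HashMap, so put() and remove() never pass through any overridden methods.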





> Stats via JMX
> -
>
> Key: SOLR-256
> URL: https://issues.apache.org/jira/browse/SOLR-256
> Project: Solr
>  Issue Type: New Feature
>  Components: search, update
>Reporter: Sharad Agarwal
>Assignee: Hoss Man
>Priority: Minor
> Fix For: 1.3
>
> Attachments: jmx.patch, jmx.patch, jmx.patch, jmx.patch, jmx.patch, 
> SOLR-256.patch, SOLR-256.patch, SOLR-256.patch, SOLR-256.patch, 
> SOLR-256.patch, SOLR-256.patch, SOLR-256.patch, SOLR-256.patch, SOLR-256.patch
>
>
> This patch adds JMX capability to get statistics from all the SolrInfoMBean.
> The implementation is done such a way to minimize code changes. 
> In SolrInfoRegistry, I have overloaded Map's  put and remove methods to 
> register and unregister SolrInfoMBean in MBeanServer. 
> Later on, I am planning to use register and unregister methods in 
> SolrInfoRegistry and removing getRegistry() method (Hiding the map instance 
> to other classes)




[jira] Updated: (SOLR-256) Stats via JMX

2008-07-28 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-256:
--

Assignee: Shalin Shekhar Mangar  (was: Hoss Man)

Shalin's been doing all the work on this puppy, no reason for it to be assigned 
to me anymore.

Commit whenever you're ready dude.




[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Fuad Efendi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617639#action_12617639
 ] 

Fuad Efendi commented on SOLR-665:
--

This is an extremely simple concurrent LRU; I spent an hour creating it. It is 
based on ConcurrentHashMap. I don't use java.util.concurrent.locks, and I am 
trying to focus on _requirements only_, avoiding implementing unnecessary 
methods of the Map interface (so I am not following the _contract_ ;) very sorry!)

{code}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ConcurrentLRU<K, V> {

    protected ConcurrentHashMap<K, ValueWrapper<V>> map;
    protected int maxEntries;

    public ConcurrentLRU(int maxEntries) {
        map = new ConcurrentHashMap<K, ValueWrapper<V>>();
        this.maxEntries = maxEntries;
    }

    public V put(K key, V value) {
        ValueWrapper<V> wrapper = map.put(key, new ValueWrapper<V>(value));
        checkRemove();
        return value;
    }

    void checkRemove() {
        if (map.size() <= maxEntries)
            return;
        Map.Entry<K, ValueWrapper<V>> eldestEntry = null;
        long eldestAge = Long.MAX_VALUE;
        for (Map.Entry<K, ValueWrapper<V>> entry : map.entrySet()) {
            long popularity = entry.getValue().popularity;
            if (eldestEntry == null ||
                    eldestEntry.getValue().popularity > popularity) {
                eldestEntry = entry;
            }
        }
        map.remove(eldestEntry.getKey(), eldestEntry.getValue());
    }

    public V get(Object key) {
        ValueWrapper<V> wrapper = map.get(key);
        return wrapper == null ? null : wrapper.getValue();
    }

    public final static class ValueWrapper<V> {
        static volatile long popularityCounter;
        volatile long popularity;
        V value;

        ValueWrapper(V value) {
            this.value = value;
            popularity = popularityCounter++;
        }

        public boolean equals(Object o) {
            if (!(o instanceof ValueWrapper)) {
                return false;
            }
            return (value == null ? ((ValueWrapper<?>) o).value == null
                    : value.equals(((ValueWrapper<?>) o).value));
        }

        public int hashCode() {
            return value.hashCode();
        }

        public V getValue() {
            popularity = popularityCounter++;
            return value;
        }

    }

}
{code}

> FIFO Cache (Unsynchronized): 9x times performance boost
> ---
>
> Key: SOLR-665
> URL: https://issues.apache.org/jira/browse/SOLR-665
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 1.3
> Environment: JRockit R27 (Java 6)
>Reporter: Fuad Efendi
> Attachments: FIFOCache.java
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Attached is modified version of LRUCache where 
> 1. map = new LinkedHashMap(initialSize, 0.75f, false) - so that 
> "reordering"/true (performance bottleneck of LRU) is replaced to 
> "insertion-order"/false (so that it became FIFO)
> 2. Almost all (absolutely unneccessary) synchronized statements commented out
> See discussion at 
> http://www.nabble.com/LRUCache---synchronized%21--td16439831.html
> Performance metrics (taken from SOLR Admin):
> LRU
> Requests: 7638
> Average Time-Per-Request: 15300
> Average Request-per-Second: 0.06
> FIFO:
> Requests: 3355
> Average Time-Per-Request: 1610
> Average Request-per-Second: 0.11
> Performance increased 9 times which roughly corresponds to a number of CPU in 
> a system, http://www.tokenizer.org/ (Shopping Search Engine at Tokenizer.org)
> Current number of documents: 7494689
> name:  filterCache  
> class:org.apache.solr.search.LRUCache  
> version:  1.0  
> description:  LRU Cache(maxSize=1000, initialSize=1000)  
> stats:lookups : 15966954582
> hits : 16391851546
> hitratio : 0.102
> inserts : 4246120
> evictions : 0
> size : 2668705
> cumulative_lookups : 16415839763
> cumulative_hits : 16411608101
> cumulative_hitratio : 0.99
> cumulative_inserts : 4246246
> cumulative_evictions : 0 
> Thanks




[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Fuad Efendi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617639#action_12617639
 ] 

funtick edited comment on SOLR-665 at 7/28/08 6:21 PM:
---

This is an extremely simple concurrent LRU; I spent an hour creating it. It is 
based on ConcurrentHashMap. I don't use java.util.concurrent.locks, and I am 
trying to focus on _requirements only_, avoiding implementing unnecessary 
methods of the Map interface (so I am not following the _contract_ ;) very sorry!)

{code}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ConcurrentLRU<K, V> {

    protected ConcurrentHashMap<K, ValueWrapper<V>> map;
    protected int maxEntries;

    public ConcurrentLRU(int maxEntries) {
        map = new ConcurrentHashMap<K, ValueWrapper<V>>();
        this.maxEntries = maxEntries;
    }

    public V put(K key, V value) {
        ValueWrapper<V> wrapper = map.put(key, new ValueWrapper<V>(value));
        checkRemove();
        return value;
    }

    void checkRemove() {
        if (map.size() <= maxEntries)
            return;
        Map.Entry<K, ValueWrapper<V>> eldestEntry = null;
        for (Map.Entry<K, ValueWrapper<V>> entry : map.entrySet()) {
            long popularity = entry.getValue().popularity;
            if (eldestEntry == null ||
                    eldestEntry.getValue().popularity > popularity) {
                eldestEntry = entry;
            }
        }
        map.remove(eldestEntry.getKey(), eldestEntry.getValue());
    }

    public V get(Object key) {
        ValueWrapper<V> wrapper = map.get(key);
        return wrapper == null ? null : wrapper.getValue();
    }

    public final static class ValueWrapper<V> {
        static volatile long popularityCounter;
        volatile long popularity;
        V value;

        ValueWrapper(V value) {
            this.value = value;
            popularity = popularityCounter++;
        }

        public boolean equals(Object o) {
            if (!(o instanceof ValueWrapper)) {
                return false;
            }
            return (value == null ? ((ValueWrapper<?>) o).value == null
                    : value.equals(((ValueWrapper<?>) o).value));
        }

        public int hashCode() {
            return value.hashCode();
        }

        public V getValue() {
            popularity = popularityCounter++;
            return value;
        }

    }

}
{code}


[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Fuad Efendi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617639#action_12617639
 ] 

funtick edited comment on SOLR-665 at 7/28/08 6:28 PM:
---

This is an extremely simple concurrent LRU; I spent an hour creating it. It is 
based on ConcurrentHashMap. I don't use java.util.concurrent.locks, and I am 
trying to focus on _requirements only_, avoiding implementing unnecessary 
methods of the Map interface (so I am not following the _contract_ ;) very sorry!)

{code}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ConcurrentLRU<K, V> {

    protected ConcurrentHashMap<K, ValueWrapper<V>> map;
    protected int maxEntries;

    public ConcurrentLRU(int maxEntries) {
        map = new ConcurrentHashMap<K, ValueWrapper<V>>();
        this.maxEntries = maxEntries;
    }

    public V put(K key, V value) {
        ValueWrapper<V> wrapper = map.put(key, new ValueWrapper<V>(value));
        checkRemove();
        return value;
    }

    void checkRemove() {
        if (map.size() <= maxEntries)
            return;
        Map.Entry<K, ValueWrapper<V>> eldestEntry = null;
        for (Map.Entry<K, ValueWrapper<V>> entry : map.entrySet()) {
            long popularity = entry.getValue().popularity;
            if (eldestEntry == null ||
                    eldestEntry.getValue().popularity > popularity) {
                eldestEntry = entry;
            }
        }
        map.remove(eldestEntry.getKey(), eldestEntry.getValue());
    }

    public V get(Object key) {
        ValueWrapper<V> wrapper = map.get(key);
        return wrapper == null ? null : wrapper.getValue();
    }

    public final static class ValueWrapper<V> {
        static volatile long popularityCounter;
        volatile long popularity;
        V value;

        ValueWrapper(V value) {
            this.value = value;
            popularity = popularityCounter++;
        }

        public boolean equals(Object o) {
            if (!(o instanceof ValueWrapper)) {
                return false;
            }
            return (value == null ? ((ValueWrapper<?>) o).value == null
                    : value.equals(((ValueWrapper<?>) o).value));
        }

        public int hashCode() {
            return value.hashCode();
        }

        public V getValue() {
            popularity = popularityCounter++;
            return value;
        }

    }

}
{code}

P.S.
Do we need to synchronize put() or checkRemove()? The only hypothetical problem 
is an OutOfMemoryError. But this is just a first draft, very simplified... we do 
not need to sort an array.


[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617646#action_12617646
 ] 

Yonik Seeley commented on SOLR-665:
---

Fuad, you are on the right track now by using something that is thread-safe 
(ConcurrentHashMap).  A couple of minor points:
- I don't see the point of the static popularityCounter... that looks like a 
bug.
- increments aren't atomic, but losing an increment once in a while should be 
fine in this scenario
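
On the second point: `popularityCounter++` on a volatile long is a read followed by a write, so two threads can hand out the same tick. If exact ticks ever mattered, java.util.concurrent.atomic.AtomicLong would be one way to get them. A minimal sketch (the class name CounterDemo is made up, and this is not part of the posted code):

```java
import java.util.concurrent.atomic.AtomicLong;

final class CounterDemo {
    static final AtomicLong counter = new AtomicLong();

    // Runs `threads` workers that each claim `perThread` ticks; returns the final count.
    static long run(int threads, final int perThread) {
        counter.set(0);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(new Runnable() {
                public void run() {
                    for (int i = 0; i < perThread; i++) {
                        counter.getAndIncrement(); // atomic read-modify-write
                    }
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return counter.get();
    }
}
```

With AtomicLong no increment is lost, at the cost of a compare-and-swap per access. For an approximate recency stamp, the occasional lost increment may well be the better trade.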

Anyway, if this works for you, use it!
It's likely to be *very* specific to your use-case though (with millions of 
cache entries, millions of lookups per request, and no evictions).  The 
eviction code looks like it would be relatively expensive.
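
For comparison, the usual way to get constant-time eviction is an access-ordered LinkedHashMap with removeEldestEntry() overridden. The sketch below is an illustration only (BoundedLru and create are made-up names), and it reintroduces a single lock via Collections.synchronizedMap, which is exactly the cost this issue is trying to avoid:

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

final class BoundedLru {
    // Returns a synchronized, access-ordered map that drops its eldest entry past maxEntries.
    static <K, V> Map<K, V> create(final int maxEntries) {
        LinkedHashMap<K, V> lru = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries; // evict the head of the list, no scan
            }
        };
        return Collections.synchronizedMap(lru);
    }
}
```

Eviction here removes the head of the internal linked list instead of scanning every entry, so its cost does not grow with the size of the cache.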




[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Lars Kotthoff (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617648#action_12617648
 ] 

Lars Kotthoff commented on SOLR-665:


bq. Not everything in JAVA is extremely good: for instance, synchronization. 
Even for single-threaded application, it needs additionally 600 CPU cycles 
(I've read it somewhere for SUN Java 1.3 on Windows)

That's probably not true for modern JVMs though -- cf. 
http://www.ibm.com/developerworks/library/j-jtp04223.html.

bq. ...9x times performance boost...

How did you measure that exactly? The Solr admin pages will *not* give you 
exact measurements. Could you describe the test setup in detail? I'm guessing 
that you're caching the results of all queries in memory such that no disk 
access is necessary. Are you using highlighting or anything else that might be 
CPU-intensive at all? From my personal experience with Solr I wouldn't expect 
synchronization for the caches to be that big of a performance penalty. In some 
of my tests with a several GB index where all results were cached and 
highlighting was turned on I've seen throughputs in excess of 400 searches per 
second. I think that the performance bottleneck in this case was the network 
interface for sending the replies.

bq. absolutely no need to synchronize get() method for FIFO!

Consider the following case: thread A performs a synchronized put, thread B 
performs an unsynchronized get on the same key. B gets scheduled before A 
completes, the returned value will be undefined. Yes, we could do sanity checks 
to minimise these cases, but that would probably end up being more expensive 
than the synchronization.

bq. From JavaDoc: "Note that this implementation is not synchronized. If 
multiple threads access a linked hash map concurrently, and at least
one of the threads modifies the map structurally, it must be synchronized 
externally."

That's exactly the case here -- the update thread modifies the map 
structurally! It doesn't do this at all times, probably even never after the 
cache has been populated, but there's no way to *know* for sure unless you 
*explicitly* remove the put method.
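
What counts as a structural modification differs between the two orderings. The LinkedHashMap Javadoc states that in an access-ordered map merely calling get() is a structural modification, while in an insertion-ordered map it is not. A single-threaded sketch that makes this visible through the fail-fast iterator (AccessOrderDemo is a made-up name):

```java
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

final class AccessOrderDemo {
    // Returns true if calling get() during iteration trips the fail-fast iterator.
    static boolean getBreaksIteration(boolean accessOrder) {
        Map<String, Integer> map = new LinkedHashMap<String, Integer>(16, 0.75f, accessOrder);
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        Iterator<String> it = map.keySet().iterator();
        it.next();      // iterator is now positioned after "a"
        map.get("a");   // in access order this relinks "a" to the tail of the list
        try {
            it.next();
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }
}
```

So in the LRU configuration every reader is also a writer of the linked list, whereas in the FIFO configuration only put() and remove() change the structure.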

I'm not convinced that we should change the current implementation for the 
following reasons:
* Concurrency is traditionally a discipline which is very hard to get right. 
Furthermore the serious bugs tend to show up only when you really get race 
conditions and the like, i.e. when the machine is under heavy load and any 
disruption will hit you seriously.
* You've already started to amend your implementation with sanity checks and 
the like -- as I've said before, this might end up being more expensive than 
synchronization.
* A FIFO cache might become a bottleneck itself -- if the cache is very large 
and the most frequently accessed item is inserted just after the cache is 
created, all accesses will need to traverse all the other entries before 
getting that item.

That said, if you can show conclusively (e.g. with a profiler) that the 
synchronized access is indeed the bottleneck and incurs a heavy penalty on 
performance, then I'm all for investigating this further.


[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Fuad Efendi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617654#action_12617654
 ] 

Fuad Efendi commented on SOLR-665:
--

bq. The Solr admin pages will not give you exact measurements. 
Yes, and I do not need exact measurements! It gives me averageTimePerRequest, 
which improved almost 10 times on the production server. Should I write JUnit 
tests and execute them in a single-threaded environment? It would be better to 
use The Grinder, but I don't have the time or spare CPUs.

bq. Consider the following case: thread A performs a synchronized put, thread B 
performs an unsynchronized get on the same key. B gets scheduled before A 
completes, the returned value will be undefined.
the returned value is well defined: it is either null or correct value.

bq. That's exactly the case here - the update thread modifies the map 
structurally! 
Who cares? We are not iterating the map!

Anyway, I believe simplified ConcurrentLRU backed by ConcurrentHashMap is 
easier to understand and troubleshoot...

bq. I don't see the point of the static popularityCounter... that looks like a 
bug.
No, it is not a bug. It is effectively a "checkpoint", like a timer: one timer 
for all instances. We can use System.currentTimeMillis() instead, but a static 
volatile long is faster.

About specific use case: yes... if someone has 0.5 seconds response time for 
faceted queries I am very happy... I had 15 seconds before going with FIFO. 


> FIFO Cache (Unsynchronized): 9x times performance boost
> ---
>
> Key: SOLR-665
> URL: https://issues.apache.org/jira/browse/SOLR-665
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 1.3
> Environment: JRockit R27 (Java 6)
>Reporter: Fuad Efendi
> Attachments: FIFOCache.java
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Attached is a modified version of LRUCache where 
> 1. map = new LinkedHashMap(initialSize, 0.75f, false) - so that 
> "reordering"/true (the performance bottleneck of LRU) is replaced with 
> "insertion-order"/false (so that it becomes FIFO)
> 2. Almost all (absolutely unnecessary) synchronized statements are commented out
> See discussion at 
> http://www.nabble.com/LRUCache---synchronized%21--td16439831.html
> Performance metrics (taken from SOLR Admin):
> LRU
> Requests: 7638
> Average Time-Per-Request: 15300
> Average Request-per-Second: 0.06
> FIFO:
> Requests: 3355
> Average Time-Per-Request: 1610
> Average Request-per-Second: 0.11
> Performance increased 9 times, which roughly corresponds to the number of CPUs in 
> the system, http://www.tokenizer.org/ (Shopping Search Engine at Tokenizer.org)
> Current number of documents: 7494689
> name:  filterCache  
> class:  org.apache.solr.search.LRUCache  
> version:  1.0  
> description:  LRU Cache(maxSize=1000, initialSize=1000)  
> stats:
>   lookups : 15966954582
>   hits : 16391851546
>   hitratio : 0.102
>   inserts : 4246120
>   evictions : 0
>   size : 2668705
>   cumulative_lookups : 16415839763
>   cumulative_hits : 16411608101
>   cumulative_hitratio : 0.99
>   cumulative_inserts : 4246246
>   cumulative_evictions : 0 
> Thanks




[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Fuad Efendi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617655#action_12617655
 ] 

Fuad Efendi commented on SOLR-665:
--

bq. The eviction code looks like it would be relatively expensive

but get() method of LinkedHashMap reorders whole map!!! (Of course, the CPU load is 
evenly distributed between several get() calls, so we can't see it.) Other 
implementations even use Arrays.sort() or something similar. I don't see an easier 
solution than that... probably some random-access policy with a predictable range 
of "popularity", where we can evict anything 'old' and not necessarily the 'eldest'...





[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617656#action_12617656
 ] 

Yonik Seeley commented on SOLR-665:
---

bq. but get() method of LinkedHashMap reorders whole map!!! 

No it doesn't... think linked-list.  It moves a single item, which is pretty 
fast.
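
A minimal sketch of that point (not taken from the attached FIFOCache.java; the class and method names are made up for illustration). In access-order mode a get() relinks only the touched entry to the tail of the internal list, while in insertion-order mode get() leaves the order alone:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class AccessOrderDemo {
    /** Returns the keys in iteration order after touching key "a" with get(). */
    static List<String> keysAfterGet(boolean accessOrder) {
        Map<String, Integer> map =
            new LinkedHashMap<String, Integer>(16, 0.75f, accessOrder);
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        map.get("a"); // access-order mode: only "a" is moved to the tail
        return new ArrayList<String>(map.keySet());
    }

    public static void main(String[] args) {
        System.out.println(keysAfterGet(true));  // prints [b, c, a]
        System.out.println(keysAfterGet(false)); // prints [a, b, c]
    }
}
```

The relink is cheap, but it is still a write to shared state, which is why get() on an access-ordered map cannot simply be left unsynchronized.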




[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Fuad Efendi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617657#action_12617657
 ] 

Fuad Efendi commented on SOLR-665:
--

Lars, I used FIFO because it is extremely simple to get unsynchronized _get()_:
{code}
// accessOrder = true: LRU cache (and we need synchronized get())
map = new LinkedHashMap(initialSize, 0.75f, true);
// accessOrder = false: FIFO (and we do not need synchronized get())
map = new LinkedHashMap(initialSize, 0.75f, false);
{code}
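
To make the difference between the two constructor flags concrete, here is a small single-threaded sketch. It assumes eviction via removeEldestEntry(); BoundedMap and EvictionDemo are hypothetical names, not the classes in the attachment, and the sketch says nothing about thread safety:

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Size-bounded map; accessOrder=true gives LRU eviction, false gives FIFO. */
class BoundedMap<K, V> extends LinkedHashMap<K, V> {
    private final int maxSize;

    BoundedMap(int maxSize, boolean accessOrder) {
        super(16, 0.75f, accessOrder);
        this.maxSize = maxSize;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called by put(); returning true drops the head of the internal list.
        return size() > maxSize;
    }
}

class EvictionDemo {
    /** Inserts a and b, reads a, inserts c, and reports which keys survive. */
    static String survivors(boolean accessOrder) {
        BoundedMap<String, Integer> cache =
            new BoundedMap<String, Integer>(2, accessOrder);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a"); // only matters in access-order (LRU) mode
        cache.put("c", 3);
        return cache.keySet().toString();
    }

    public static void main(String[] args) {
        System.out.println(survivors(true));  // prints [a, c]
        System.out.println(survivors(false)); // prints [b, c]
    }
}
```

With LRU the recently read "a" survives and "b" is evicted; with FIFO the read is ignored and "a", the oldest insert, is evicted.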

Yonik, I'll try to improve ConcurrentLRU and to share findings... of course 
FIFO is not what we need.




[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Lars Kotthoff (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617658#action_12617658
 ] 

Lars Kotthoff commented on SOLR-665:


bq. the returned value is well defined: it is either null or correct value.

It is if nothing is modifying the map during the get. If something is modifying 
the map you don't know how the implementation handles the insert of a new 
value. It might copy the object, and you'd end up with half an object or even 
an invalid memory location. That's why the javadoc says that you *must* 
synchronize accesses if *anything* modifies the map -- this is not limited to 
iterators.
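
One conventional way to follow that javadoc advice is to wrap the map so that every access, reads included, takes the same lock. The sketch below uses Collections.synchronizedMap from the JDK; SynchronizedCacheDemo and its method are hypothetical names used only for illustration:

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

class SynchronizedCacheDemo {
    /** Several threads put and get distinct keys; returns the final map size. */
    static int fillConcurrently(final int threads, final int perThread) {
        // Every method of the wrapper synchronizes on the wrapper itself.
        final Map<Integer, Integer> map = Collections.synchronizedMap(
            new LinkedHashMap<Integer, Integer>(16, 0.75f, true));
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int base = t * perThread;
            workers[t] = new Thread(new Runnable() {
                public void run() {
                    for (int i = 0; i < perThread; i++) {
                        map.put(base + i, i);
                        map.get(base + i); // reads go through the same lock
                    }
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println(fillConcurrently(4, 1000)); // prints 4000
    }
}
```

This is correct but serializes all readers, which is the contention the issue is complaining about.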




[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Fuad Efendi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617657#action_12617657
 ] 

funtick edited comment on SOLR-665 at 7/28/08 7:36 PM:
---

Lars, I used FIFO because it is extremely simple to get unsynchronized _get()_:
{code}
// accessOrder = true: LRU cache (and we need synchronized get())
map = new LinkedHashMap(initialSize, 0.75f, true);
// accessOrder = false: FIFO (and we do not need synchronized get())
map = new LinkedHashMap(initialSize, 0.75f, false);
{code}

Yonik, I'll try to improve ConcurrentLRU and to share findings... of course 
FIFO is not what we need.

bq. No it doesn't... think linked-list. It moves a single item, which is pretty 
fast.
Yes, that is why I wrote 'evenly distributed between several get() so we can't see it': 
it keeps the list ordered, so we can't make get() unsynchronized, with all the consequences!!!

  was (Author: funtick):
Lars, I used FIFO because it is extremely simple to get unsynchronized 
_get()_:
{code}
// accessOrder = true: LRU cache (and we need synchronized get())
map = new LinkedHashMap(initialSize, 0.75f, true);
// accessOrder = false: FIFO (and we do not need synchronized get())
map = new LinkedHashMap(initialSize, 0.75f, false);
{code}

Yonik, I'll try to improve ConcurrentLRU and to share findings... of course 
FIFO is not what we need.
  



[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617662#action_12617662
 ] 

Yonik Seeley commented on SOLR-665:
---

bq. the returned value is well defined: it is either null or correct value.

No, it is not!  Your analysis seems to ignore the Java memory model (partially 
constructed objects and all that).  I don't know how many different ways to say 
it... please do yourself a favor and read up on the Java memory model (and the 
book I previously referenced is great for this).  This is hard stuff (at the 
lowest levels).
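
As a rough illustration of what the memory model asks for (a sketch with invented class names, not code from Solr): reads and writes of references are atomic per the Java Language Specification (section 17.7), but atomicity alone does not guarantee that another thread sees the fields of the referenced object. Two standard tools that do give that guarantee are final fields and a volatile reference:

```java
class CachedValue {
    // final fields: a thread that obtains a reference to a fully constructed
    // CachedValue is guaranteed to see these values (JLS 17.5).
    final String key;
    final int docCount;

    CachedValue(String key, int docCount) {
        this.key = key;
        this.docCount = docCount;
    }
}

class Holder {
    // volatile: a write to this field happens-before every later read of it.
    private volatile CachedValue current;

    void publish(CachedValue value) {
        current = value;
    }

    CachedValue read() {
        return current;
    }
}

class SafePublicationDemo {
    /** Publishes a value from a second thread and reads it back after join(). */
    static CachedValue publishFromOtherThread() {
        final Holder holder = new Holder();
        Thread writer = new Thread(new Runnable() {
            public void run() {
                holder.publish(new CachedValue("type:web", 42));
            }
        });
        writer.start();
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return holder.read();
    }

    public static void main(String[] args) {
        CachedValue v = publishFromOtherThread();
        System.out.println(v.key + " " + v.docCount); // prints type:web 42
    }
}
```

A plain HashMap or LinkedHashMap gives neither guarantee for its internal entries and table, so an unsynchronized get() racing with a put() has no defined outcome under the memory model.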




[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Fuad Efendi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617664#action_12617664
 ] 

Fuad Efendi commented on SOLR-665:
--

bq. It is if nothing is modifying the map during the get. If something is 
modifying the map you don't know how the implementation handles the insert of a 
new value. It might copy the object, and you'd end up with half an object or 
even an invalid memory location. That's why the javadoc says that you must 
synchronize accesses if anything modifies the map - this is not limited to 
iterators.

JavaDoc does not say that. Instead, it says (I am repeating):
bq. ..and at least one of the threads modifies the map structurally, it must be 
synchronized externally. 

- only the thread doing the structural modification must be synchronized. In the 
case of LinkedHashMap, for instance, we need to synchronize inserts in order to 
avoid Entry instances referencing themselves (orphans).


bq. you don't know how the implementation handles the insert of a new value

I know exactly: SOLR does not modify 'value' during 'insert', Map.Entry 
instances are immutable in SOLR, etc. Table resize is the main problem, but after 
analyzing the source code I don't see any problem. The concern that 'a wrong value 
will be returned for a key' is not applicable, and the JavaDoc does not say 
anything about that. Collections internally use Map.Entry in an immutable way and 
do not change it.




[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Fuad Efendi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617668#action_12617668
 ] 

Fuad Efendi commented on SOLR-665:
--

bq. No, it is not! Your analysis seems to ignore the Java memory model 
(partially constructed objects and all that). I don't know how many different 
ways to say it... please do yourself a favor and read up on the Java memory 
model (and the book I previously referenced is great for this). This is hard 
stuff (at the lowest levels).

Ok. Maybe I can get a reference to the wrong object type, or even to an object 
scheduled for finalization... But we are not inserting 'partially constructed 
objects' into the Map, are we?

Simplest scenario: thread A tries to read a variable (4 bytes of address in the 
JVM) pointing to object O. Another thread B concurrently assigns _null_ to that 
variable. Isn't that solved at the CPU level yet? Or maybe, on a 64-bit system, 
thread B assigns zero to the first 2 bytes and then to the other 2 bytes?

I need to study this book... BTW, I am running the JVM with the '-server' option.




[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Lars Kotthoff (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617669#action_12617669
 ] 

Lars Kotthoff commented on SOLR-665:


The way I understand the javadoc is

bq. ..and at least one of the threads modifies the map structurally, *it* must 
be synchronized externally.

The *it* references the map. It says explicitly that the *map* must be 
synchronized, i.e. all operations on it. It does not say that only the thread 
modifying it must be synchronized. Consider the case where only one thread 
modifies the map. You're saying that in this case synchronization would not be 
necessary, as only modifications themselves need to be synchronized and there 
is only one thread doing that. The javadoc explicitly says that it is 
necessary.
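
If the goal is to keep readers from blocking each other while still synchronizing the map as a whole, one option (my own suggestion, not something proposed in this thread) is a read/write lock around an insertion-ordered map. It works only because get() on an insertion-ordered LinkedHashMap does not modify the map; in access-order mode the javadoc counts get() as a structural modification, so a read lock would not be enough. ReadWriteLockedFifoCache is a hypothetical name:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class ReadWriteLockedFifoCache<K, V> {
    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    private final Map<K, V> map;

    ReadWriteLockedFifoCache(final int maxSize) {
        // accessOrder=false: insertion order, so get() never touches the list.
        this.map = new LinkedHashMap<K, V>(16, 0.75f, false) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxSize;
            }
        };
    }

    V get(K key) {
        lock.readLock().lock(); // many readers may hold this at once
        try {
            return map.get(key);
        } finally {
            lock.readLock().unlock();
        }
    }

    V put(K key, V value) {
        lock.writeLock().lock(); // excludes all readers and other writers
        try {
            return map.put(key, value);
        } finally {
            lock.writeLock().unlock();
        }
    }

    int size() {
        lock.readLock().lock();
        try {
            return map.size();
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

For example, with maxSize 2, putting "a", "b" and then "c" leaves "b" and "c" in the cache and get("a") returns null. Hit counters and other statistics would need their own thread-safe handling, which the sketch leaves out.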




[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Fuad Efendi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617673#action_12617673
 ] 

Fuad Efendi commented on SOLR-665:
--

bq. The it references the map. It says explicitly that the map must be 
synchronized.

- I agree, thanks for pointing to it. Synchronize!

BTW, Joshua Bloch developed Arrays.sort(), and a bug was found after 9 years. 
Nothing is perfect.

ConcurrentLRU looks extremely simple and easy to improve. Should we check SUN's 
bug database before using ConcurrentHashMap? It has some related...




[jira] Commented: (SOLR-614) Allow components to read any kind of XML from solrconfig

2008-07-28 Thread Ryan McKinley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617674#action_12617674
 ] 

Ryan McKinley commented on SOLR-614:


As I have said, I think consistency is a good thing and we will hopefully get 
out of the custom config format business in 2.0.  But I also see that having 
ugly configs makes it difficult to be clear about what it does.  (I remember 
struggling to figure out that lst != 1st)  Ugly configs are a big deal, so I 
hate to throw sticks at the endeavor...  I imagine any translation to a new 
format would involve reading it and then outputting the relevant configs rather 
than trying some sort of text manipulation.  With that in mind, it probably 
makes little difference on that front.

So I'll change my vote to -0, and I'll let you all sort out what should 
happen...



> Allow components to read any kind of XML from solrconfig
> 
>
> Key: SOLR-614
> URL: https://issues.apache.org/jira/browse/SOLR-614
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 1.3
>Reporter: Noble Paul
>Assignee: Shalin Shekhar Mangar
>Priority: Trivial
> Fix For: 1.3
>
> Attachments: SOLR-614.patch, SOLR-614.patch, SOLR-614.patch, 
> SOLR-614.patch
>
>
> All the components initialized by Solr have an init(NamedList args) 
> initializer. This leads us to write the configuration needed for the 
> component in the NamedList XML format. People familiar with Solr may know the 
> format, but most of what is written is noise rather than information. Users who 
> are not familiar with the format find it too difficult to understand why they 
> have to write it this way. Moreover, it is not a very efficient way to 
> configure.




[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Fuad Efendi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617654#action_12617654
 ] 

funtick edited comment on SOLR-665 at 7/28/08 8:36 PM:
---

bq. The Solr admin pages will not give you exact measurements. 
Yes, and I do not need exact measurements! It gives me averageTimePerRequest, 
which improved almost 10 times on the production server. Should I write JUnit tests 
and execute them in a single-threaded environment? It would be better to use The 
Grinder, but I don't have the time or spare CPUs.

bq. I've seen throughputs in excess of 400 searches per second. 
But 'searches per second' is not the same as 'average response time'!!!

bq. Are you using highlighting or anything else that might be CPU-intensive at 
all? 
Yes, I am using highlighting. You can see it at http://www.tokenizer.org


bq. I'm guessing that you're caching the results of all queries in memory such 
that no disk access is necessary.
{color:red} But this is another bug of SOLR!!! I am using extremely large 
caches but SOLR still *recalculates* facet intersections. {color}


bq. Consider the following case: thread A performs a synchronized put, thread B 
performs an unsynchronized get on the same key. B gets scheduled before A 
completes, the returned value will be undefined.
The returned value is well defined: it is either null or the correct value.

bq. That's exactly the case here - the update thread modifies the map 
structurally! 
Who cares? We are not iterating the map!

Anyway, I believe simplified ConcurrentLRU backed by ConcurrentHashMap is 
easier to understand and troubleshoot...
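
For illustration only, a cache backed by ConcurrentHashMap could be sketched roughly as follows. This is not the attached FIFOCache.java and not a SolrCache implementation; the class name ConcurrentFifoCache and its methods are hypothetical, and it evicts in FIFO order rather than LRU to keep the sketch short. The size bound is only approximate while several threads are inserting at once.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical sketch: a FIFO-evicting cache with lock-free reads.
final class ConcurrentFifoCache<K, V> {
    private final ConcurrentHashMap<K, V> map = new ConcurrentHashMap<>();
    // Remembers the order in which keys were first inserted.
    private final ConcurrentLinkedQueue<K> insertionOrder = new ConcurrentLinkedQueue<>();
    private final int maxSize;

    ConcurrentFifoCache(int maxSize) {
        if (maxSize < 1) {
            throw new IllegalArgumentException("maxSize must be >= 1");
        }
        this.maxSize = maxSize;
    }

    // No locking: ConcurrentHashMap returns null or a fully published value.
    V get(K key) {
        return map.get(key);
    }

    void put(K key, V value) {
        if (map.put(key, value) == null) {
            insertionOrder.add(key); // key was not present before
        }
        // Evict the oldest keys while the map is over capacity.
        while (map.size() > maxSize) {
            K oldest = insertionOrder.poll();
            if (oldest == null) {
                break;
            }
            map.remove(oldest);
        }
    }

    int size() {
        return map.size();
    }

    public static void main(String[] args) {
        ConcurrentFifoCache<String, Integer> cache = new ConcurrentFifoCache<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.put("c", 3); // pushes out "a", the first key inserted
        System.out.println(cache.get("a") + " " + cache.get("b") + " " + cache.get("c"));
    }
}
```

Reads never block here, which is the property being asked for; the price is that eviction order ignores how often an entry is used.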

bq. I don't see the point of the static popularityCounter... that looks like a 
bug.
No, it is not a bug. It is effectively a "checkpoint", like a timer: one timer 
for all instances. We could use System.currentTimeMillis() instead, but a static 
volatile long is faster.
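
A rough sketch of the "one timer for all instances" idea, under the assumption that the counter only needs to produce increasing stamps. The names LogicalClock and StampedValue are hypothetical and this is not the code from the attachment. It uses AtomicLong because `counter++` on a plain volatile long is a read-modify-write and can lose updates when several threads increment at once.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical shared logical clock: one counter for every cache instance.
final class LogicalClock {
    private static final AtomicLong TICKS = new AtomicLong();

    private LogicalClock() {
    }

    // Returns a strictly increasing stamp without calling System.currentTimeMillis().
    static long next() {
        return TICKS.incrementAndGet();
    }
}

// Hypothetical cache entry that records when it was last touched.
final class StampedValue<V> {
    final V value;
    volatile long lastAccess;

    StampedValue(V value) {
        this.value = value;
        this.lastAccess = LogicalClock.next();
    }

    // Marks the entry as recently used and returns its value.
    V touch() {
        lastAccess = LogicalClock.next();
        return value;
    }
}
```

An eviction pass could then compare lastAccess stamps across entries, whichever cache instance they belong to.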

About specific use case: yes... if someone has 0.5 seconds response time for 
faceted queries I am very happy... I had 15 seconds before going with FIFO. 


> FIFO Cache (Unsynchronized): 9x times performance boost
> ---
>
> Key: SOLR-665
> URL: https://issues.apache.org/jira/browse/SOLR-665
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 1.3
> Environment: JRockit R27 (Java 6)
>Reporter: Fuad Efendi
> Attachments: FIFOCache.java
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Attached is a modified version of LRUCache where 
> 1. map = new LinkedHashMap(initialSize, 0.75f, false) - so that 
> "reordering"/true (the performance bottleneck of LRU) is replaced with 
> "insertion-order"/false (so that it becomes FIFO)
> 2. Almost all (absolutely unnecessary) synchronized statements are commented out
> See discussion at 
> http://www.nabble.com/LRUCache---synchronized%21--td16439831.html
> Performance metrics (taken from SOLR Admin):
> LRU
> Requests: 7638
> Average Time-Per-Request: 15300
> Average Request-per-Second: 0.06
> FIFO:
> Requests: 3355
> Average Time-Per-Request: 1610
> Average Request-per-Second: 0.11
> Performance increased 9 times, which roughly corresponds to the number of CPUs in 
> the system, http://www.tokenizer.org/ (Shopping Search Engine at Tokenizer.org)
> Current number of documents: 7494689
> name:  filterCache  
> class:org.apache.solr.search.LRUCache  
> version:  1.0  
> description:  LRU Cache(maxSize=1000, initialSize=1000)  
> stats:lookups : 15966954582
> hits : 16

[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Fuad Efendi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617654#action_12617654
 ] 

funtick edited comment on SOLR-665 at 7/28/08 8:41 PM:
---

bq. The Solr admin pages will not give you exact measurements. 
Yes, and I do not need exact measurements! It gives me averageTimePerRequest, 
which improved almost 10 times on the production server. Should I write JUnit tests 
and execute them in a single-threaded environment? It would be better to use The 
Grinder, but I don't have the time or spare CPUs.

bq. I've seen throughputs in excess of 400 searches per second. 
But 'searches per second' is not the same as 'average response time'!!!

bq. Are you using highlighting or anything else that might be CPU-intensive at 
all? 
Yes, I am using highlighting. You can see it at http://www.tokenizer.org


bq. I'm guessing that you're caching the results of all queries in memory such 
that no disk access is necessary.
{color:red} But this is another bug of SOLR!!! I am using extremely large 
caches but SOLR still *recalculates* facet intersections. {color}

bq. A FIFO cache might become a bottleneck itself - if the cache is very large 
and the most frequently accessed item is inserted just after the cache is 
created, all accesses will need to traverse all the other entries before 
getting that item.

- Sorry, I didn't understand... Yes, if the cache contains 10 entries and the 'most 
popular item' is removed... Why 'traverse all the other entries before getting 
that item'? Why are 9 items less popular (cumulatively) than a single one 
(absolutely)?


bq. Consider the following case: thread A performs a synchronized put, thread B 
performs an unsynchronized get on the same key. B gets scheduled before A 
completes, the returned value will be undefined.
The returned value is well defined: it is either null or the correct value.

bq. That's exactly the case here - the update thread modifies the map 
structurally! 
Who cares? We are not iterating the map!

Anyway, I believe simplified ConcurrentLRU backed by ConcurrentHashMap is 
easier to understand and troubleshoot...

bq. I don't see the point of the static popularityCounter... that looks like a 
bug.
No, it is not a bug. It is effectively a "checkpoint", like a timer: one timer 
for all instances. We could use System.currentTimeMillis() instead, but a static 
volatile long is faster.

About specific use case: yes... if someone has 0.5 seconds response time for 
faceted queries I am very happy... I had 15 seconds before going with FIFO. 



[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Craig McClanahan (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617676#action_12617676
 ] 

Craig McClanahan commented on SOLR-665:
---

Just a quick comment from the Peanut Gallery.

Instead of arguing about what the JavaDocs for 
java.util.Map and friends actually say, and worrying about whether particular 
implementations actually follow the rules but provide reasonable performance, 
this seems like a good opportunity to design and build an application-specific 
data structure that has the behavioral characteristics you want (Fuad is after 
fastest-possible reads, everybody is after *reasonable* behavior in the face of 
concurrent writes).  My only recommendation in this regard would be this:  
don't make a custom implementation say "implements java.util.LinkedHashMap" 
(again, or whatever implementation is currently in use) unless it does actually 
implement the documented semantics for java.util.LinkedHashMap.

It's fine to have a custom FifoCacheMap (or whatever name you like) class that 
does not implement java.util.Map.  It's *not* fine to say you implement an 
interface but then break the documented contract.


> FIFO Cache (Unsynchronized): 9x times performance boost
> ---
>
> Key: SOLR-665
> URL: https://issues.apache.org/jira/browse/SOLR-665
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 1.3
> Environment: JRockit R27 (Java 6)
>Reporter: Fuad Efendi
> Attachments: FIFOCache.java
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Attached is a modified version of LRUCache where 
> 1. map = new LinkedHashMap(initialSize, 0.75f, false) - so that 
> "reordering"/true (the performance bottleneck of LRU) is replaced with 
> "insertion-order"/false (so that it becomes FIFO)
> 2. Almost all (absolutely unnecessary) synchronized statements are commented out
> See discussion at 
> http://www.nabble.com/LRUCache---synchronized%21--td16439831.html
> Performance metrics (taken from SOLR Admin):
> LRU
> Requests: 7638
> Average Time-Per-Request: 15300
> Average Request-per-Second: 0.06
> FIFO:
> Requests: 3355
> Average Time-Per-Request: 1610
> Average Request-per-Second: 0.11
> Performance increased 9 times, which roughly corresponds to the number of CPUs in 
> the system, http://www.tokenizer.org/ (Shopping Search Engine at Tokenizer.org)
> Current number of documents: 7494689
> name:  filterCache  
> class:org.apache.solr.search.LRUCache  
> version:  1.0  
> description:  LRU Cache(maxSize=1000, initialSize=1000)  
> stats:lookups : 15966954582
> hits : 16391851546
> hitratio : 0.102
> inserts : 4246120
> evictions : 0
> size : 2668705
> cumulative_lookups : 16415839763
> cumulative_hits : 16411608101
> cumulative_hitratio : 0.99
> cumulative_inserts : 4246246
> cumulative_evictions : 0 
> Thanks
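
As a quick sanity check of the figures quoted above, the ratios can be computed directly. This is only arithmetic on the numbers in the description, assuming the Average Time-Per-Request values are in milliseconds; the class and method names are made up for the example.

```java
// Hypothetical helper for checking the ratios in the issue description.
final class CacheStats {
    private CacheStats() {
    }

    // How many times faster the "after" response time is than the "before" one.
    static double speedup(double beforeMs, double afterMs) {
        return beforeMs / afterMs;
    }

    // Fraction of lookups answered from the cache.
    static double hitRatio(long hits, long lookups) {
        return lookups == 0 ? 0.0 : (double) hits / lookups;
    }

    public static void main(String[] args) {
        // Average Time-Per-Request: LRU 15300 vs FIFO 1610.
        System.out.printf("response time ratio: %.2f%n", speedup(15300, 1610));
        // Average Request-per-Second: LRU 0.06 vs FIFO 0.11.
        System.out.printf("throughput ratio: %.2f%n", 0.11 / 0.06);
        // cumulative_hits / cumulative_lookups from the filterCache stats.
        System.out.printf("cumulative hit ratio: %.4f%n",
                hitRatio(16411608101L, 16415839763L));
    }
}
```

The response-time ratio comes out at about 9.5, while the requests-per-second figures (0.06 and 0.11) differ by a factor of roughly 1.8, so the two metrics do not tell quite the same story.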




[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Fuad Efendi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617654#action_12617654
 ] 

funtick edited comment on SOLR-665 at 7/28/08 8:46 PM:
---

bq. The Solr admin pages will not give you exact measurements. 
Yes, and I do not need exact measurements! It gives me averageTimePerRequest, 
which improved almost 10 times on the production server. Should I write JUnit tests 
and execute them in a single-threaded environment? It would be better to use The 
Grinder, but I don't have the time or spare CPUs.

bq. I've seen throughputs in excess of 400 searches per second. 
But 'searches per second' is not the same as 'average response time'!!!

bq. Are you using highlighting or anything else that might be CPU-intensive at 
all? 
Yes, I am using highlighting. You can see it at http://www.tokenizer.org


bq. I'm guessing that you're caching the results of all queries in memory such 
that no disk access is necessary.
{color:red} But this is another bug of SOLR!!! I am using extremely large 
caches but SOLR still *recalculates* facet intersections. {color}

bq. A FIFO cache might become a bottleneck itself - if the cache is very large 
and the most frequently accessed item is inserted just after the cache is 
created, all accesses will need to traverse all the other entries before 
getting that item.

- Sorry, I didn't understand... Yes, if the cache contains 10 entries and the 'most 
popular item' is removed... Why 'traverse all the other entries before getting 
that item'? Why are 9 items less popular (cumulatively) than a single one 
(absolutely)?

You probably mean 'LinkedList traversal', but this is not the case. This is why 
we need to read the Java source... LinkedHashMap extends HashMap and there is no 
'traversal' at all:
{code}
public V get(Object key) {
    Entry<K,V> e = (Entry<K,V>)getEntry(key);
    if (e == null)
        return null;
    e.recordAccess(this);
    return e.value;
}
{code}
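
The recordAccess call in the method above is where the two LinkedHashMap modes differ, and a small sketch makes that visible: in access-order mode get() moves the entry to the end of the internal linked list, which the LinkedHashMap javadoc counts as a structural modification, while in insertion-order mode get() leaves the order alone. The class and method names below are only for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class LinkedHashMapOrderDemo {
    private LinkedHashMapOrderDemo() {
    }

    // Inserts a, b, c, reads "a", then reports the iteration order of the keys.
    static List<String> keysAfterGet(boolean accessOrder) {
        Map<String, Integer> map = new LinkedHashMap<>(16, 0.75f, accessOrder);
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        map.get("a"); // in access-order mode this relinks "a" to the tail
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        System.out.println(keysAfterGet(true));  // prints [b, c, a]
        System.out.println(keysAfterGet(false)); // prints [a, b, c]
    }
}
```

So there is indeed no traversal on get(), but with accessOrder=true every read rewrites links in the list, which is why an unsynchronized get is a problem for the LRU variant in particular.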


bq. Consider the following case: thread A performs a synchronized put, thread B 
performs an unsynchronized get on the same key. B gets scheduled before A 
completes, the returned value will be undefined.
The returned value is well defined: it is either null or the correct value.

bq. That's exactly the case here - the update thread modifies the map 
structurally! 
Who cares? We are not iterating the map!

Anyway, I believe simplified ConcurrentLRU backed by ConcurrentHashMap is 
easier to understand and troubleshoot...

bq. I don't see the point of the static popularityCounter... that looks like a 
bug.
No, it is not a bug. It is effectively a "checkpoint", like a timer: one timer 
for all instances. We could use System.currentTimeMillis() instead, but a static 
volatile long is faster.

About specific use case: yes... if someone has 0.5 seconds response time for 
faceted queries I am very happy... I had 15 seconds before going with FIFO. 



Re: [jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Yonik Seeley
Fuad, please stop editing entries that already have responses...
it makes it very difficult to follow things.

-Yonik


[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Fuad Efendi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617654#action_12617654
 ] 

funtick edited comment on SOLR-665 at 7/28/08 8:48 PM:
---

bq. The Solr admin pages will not give you exact measurements. 
Yes, and I do not need exact measurements! It gives me averageTimePerRequest, 
which improved almost 10 times on the production server. Should I write JUnit tests 
and execute them in a single-threaded environment? It would be better to use The 
Grinder, but I don't have the time or spare CPUs.

bq. I've seen throughputs in excess of 400 searches per second. 
But 'searches per second' is not the same as 'average response time'!!!

bq. Are you using highlighting or anything else that might be CPU-intensive at 
all? 
Yes, I am using highlighting. You can see it at http://www.tokenizer.org


bq. I'm guessing that you're caching the results of all queries in memory such 
that no disk access is necessary.
{color:red} But this is another bug of SOLR!!! I am using extremely large 
caches but SOLR still *recalculates* facet intersections. {color}

bq. A FIFO cache might become a bottleneck itself - if the cache is very large 
and the most frequently accessed item is inserted just after the cache is 
created, all accesses will need to traverse all the other entries before 
getting that item.

- Sorry, I didn't understand... Yes, if the cache contains 10 entries and the 'most 
popular item' is removed... Why 'traverse all the other entries before getting 
that item'? Why are 9 items less popular (cumulatively) than a single one 
(absolutely)?

You probably mean 'LinkedList traversal', but this is not the case. This is why 
we need to read the Java source... LinkedHashMap extends HashMap and there is no 
'traversal' at all:
{code}
public V get(Object key) {
    Entry<K,V> e = (Entry<K,V>)getEntry(key);
    if (e == null)
        return null;
    e.recordAccess(this);
    return e.value;
}
{code}


bq. Consider the following case: thread A performs a synchronized put, thread B 
performs an unsynchronized get on the same key. B gets scheduled before A 
completes, the returned value will be undefined.
The returned value is well defined: it is either null or the correct value.

bq. That's exactly the case here - the update thread modifies the map 
structurally! 
Who cares? We are not iterating the map!

bq. That said, if you can show conclusively (e.g. with a profiler) that the 
synchronized access is indeed the bottleneck and incurs a heavy penalty on 
performance, then I'm all for investigating this further.

*What?!!*


Anyway, I believe simplified ConcurrentLRU backed by ConcurrentHashMap is 
easier to understand and troubleshoot...

bq. I don't see the point of the static popularityCounter... that looks like a 
bug.
No, it is not a bug. It is effectively a "checkpoint", like a timer: one timer 
for all instances. We could use System.currentTimeMillis() instead, but a static 
volatile long is faster.

About specific use case: yes... if someone has 0.5 seconds response time for 
faceted queries I am very happy... I had 15 seconds before going with FIFO. 



[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617682#action_12617682
 ] 

Yonik Seeley commented on SOLR-665:
---

bq. It's fine to have a custom FifoCacheMap (or whatever name you like) class 
that does not implement java.util.Map.

Right.  Any solr cache must currently implement the SolrCache interface, and 
LinkedHashMap is simply an implementation detail of the LRUCache implementation 
of SolrCache (which has no relationship to the Map interface).




[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Fuad Efendi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617684#action_12617684
 ] 

Fuad Efendi commented on SOLR-665:
--

bq. Fuad is after fastest-possible reads, everybody is after reasonable 
behavior in the face of concurrent writes

Thanks and sorry for runtime errors;

FIFO looks strange at first, but... for a large cache (10 items), the most 
popular item can be _mistakenly_ removed... but I don't think there are any 
'most popular facets' etc.; it's evenly distributed in most cases.

Another issue: SOLR always tries to _recalculate_ _facets_ even with an extremely 
large filterCache & queryResultCache; even the same faceted query always shows 
the same long response times.





[jira] Commented: (SOLR-614) Allow components to read any kind of XML from solrconfig

2008-07-28 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617688#action_12617688
 ] 

Noble Paul commented on SOLR-614:
-

bq... if we make this change, then someone who is already familiar with the way 
handlers are configured won't understand the "/replication" example Noble posted
The feature is still under development and its syntax is not yet finalized, 
so when we eventually commit it, it may have a different syntax. 

bq.There is a lot of value in maintaining consistency - even if it's ugly.

We do not have consistency in the way multiple components are configured, e.g. 
UpdateHandler, UpdateProcessorChain, mainIndex etc.  

bq.the fact remains that if a single type of plugin (ie: RequestHandler) can 
support multiple config syntaxes the potential exists for people to get very 
confused 

We must not support multiple formats. We must stick to one, and we will support 
only one.

bq.Looking ahead two or three moves: adding support for something like this now 
would also probably make it that much harder to write a "converter" for 
existing solr config files if/when we switch to Spring or some other Java 
object wiring/configuration system. 

We are trivializing a config format switch. It is not going to be as simple as 
writing a converter. We may need a total rewiring of components, which may 
involve code modification for all the components. 
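
To make the two styles under discussion concrete, here is a minimal sketch that reads the same setting from a NamedList-style fragment and from a free-form fragment using the JDK's DOM parser. The element and attribute names (master, replicateAfter) and the class XmlConfigDemo are hypothetical; this is neither the SOLR-614 patch nor Solr's actual config loader.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

final class XmlConfigDemo {
    private XmlConfigDemo() {
    }

    // Parses a small XML fragment and returns its root element.
    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    // NamedList style: search descendant <str name="key">value</str> elements.
    static String fromNamedList(Element root, String key) {
        NodeList strs = root.getElementsByTagName("str");
        for (int i = 0; i < strs.getLength(); i++) {
            Element str = (Element) strs.item(i);
            if (key.equals(str.getAttribute("name"))) {
                return str.getTextContent();
            }
        }
        return null;
    }

    // Free-form style: the component reads an attribute it defines itself.
    static String fromAttribute(Element root, String key) {
        return root.hasAttribute(key) ? root.getAttribute(key) : null;
    }

    public static void main(String[] args) {
        Element namedList = parse(
                "<lst name=\"master\"><str name=\"replicateAfter\">commit</str></lst>");
        Element freeForm = parse("<master replicateAfter=\"commit\"/>");
        System.out.println(fromNamedList(namedList, "replicateAfter"));
        System.out.println(fromAttribute(freeForm, "replicateAfter"));
    }
}
```

Both calls yield the same value; the difference is that the first layout is uniform across components while the second is shorter but specific to each component.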




[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Fuad Efendi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617684#action_12617684
 ] 

funtick edited comment on SOLR-665 at 7/28/08 9:20 PM:
---

bq. Fuad is after fastest-possible reads, everybody is after reasonable 
behavior in the face of concurrent writes

Thanks and sorry for runtime errors;

FIFO looks strange at first, but... for a large cache (10 items), the most 
popular item can be _mistakenly_ removed... but I don't think there are any 
'most popular facets' etc.; it's evenly distributed in most cases.

Another issue: SOLR always tries to _recalculate_ _facets_ even with an extremely 
large filterCache & queryResultCache; even the same faceted query always shows 
the same long response times.

bq. It is if nothing is modifying the map during the get. If something is 
modifying the map you don't know how the implementation handles the insert of a 
new value. It might copy the object, and you'd end up with half an object or 
even an invalid memory location. That's why the javadoc says that you must 
synchronize accesses if anything modifies the map - this is not limited to 
iterators.

I agree of course... However, we are not dealing with an unknown implementation of 
java.util.Map that somehow clones (java.lang.Cloneable) objects or uses some weird 
object introspection, etc.
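
One possible middle ground, sketched under the assumption of an insertion-ordered LinkedHashMap (where, per the LinkedHashMap javadoc, get is not a structural modification): readers share a read lock and only writers take an exclusive lock, so the javadoc's synchronization requirement is met without serializing every get. The class ReadWriteFifoCache is hypothetical and is not the attached FIFOCache.java.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch: FIFO cache with shared reads and exclusive writes.
final class ReadWriteFifoCache<K, V> {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final Map<K, V> map;

    ReadWriteFifoCache(final int maxSize) {
        // accessOrder=false: get() does not touch the linked list, so several
        // readers may run at the same time under the read lock.
        this.map = new LinkedHashMap<K, V>(16, 0.75f, false) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxSize; // drop the first-inserted entry
            }
        };
    }

    V get(K key) {
        lock.readLock().lock();
        try {
            return map.get(key);
        } finally {
            lock.readLock().unlock();
        }
    }

    void put(K key, V value) {
        lock.writeLock().lock();
        try {
            map.put(key, value);
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

This would not work for the access-ordered (LRU) map, because there every get modifies the list and would need the write lock.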

> FIFO Cache (Unsynchronized): 9x times performance boost
> ---
>
> Key: SOLR-665
> URL: https://issues.apache.org/jira/browse/SOLR-665
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 1.3
> Environment: JRockit R27 (Java 6)
>Reporter: Fuad Efendi
> Attachments: FIFOCache.java
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Attached is a modified version of LRUCache where 
> 1. map = new LinkedHashMap(initialSize, 0.75f, false) - so that 
> "reordering"/true (the performance bottleneck of LRU) is replaced with 
> "insertion-order"/false (so that it becomes FIFO)
> 2. Almost all (absolutely unnecessary) synchronized statements are commented out
> See discussion at 
> http://www.nabble.com/LRUCache---synchronized%21--td16439831.html
> Performance metrics (taken from SOLR Admin):
> LRU
> Requests: 7638
> Average Time-Per-Request: 15300
> Average Request-per-Second: 0.06
> FIFO:
> Requests: 3355
> Average Time-Per-Request: 1610
> Average Request-per-Second: 0.11
> Performance increased 9 times, which roughly corresponds to the number of CPUs in 
> the system, http://www.tokenizer.org/ (Shopping Search Engine at Tokenizer.org)
> Current number of documents: 7494689
> name:  filterCache  
> class:  org.apache.solr.search.LRUCache  
> version:  1.0  
> description:  LRU Cache(maxSize=1000, initialSize=1000)  
> stats:  lookups : 15966954582
> hits : 16391851546
> hitratio : 0.102
> inserts : 4246120
> evictions : 0
> size : 2668705
> cumulative_lookups : 16415839763
> cumulative_hits : 16411608101
> cumulative_hitratio : 0.99
> cumulative_inserts : 4246246
> cumulative_evictions : 0 
> Thanks

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Fuad Efendi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617692#action_12617692
 ] 

Fuad Efendi commented on SOLR-665:
--

BTW, there is almost no functional difference between LRU and FIFO, while 
there is a *huge difference* between LRU (Least Recently Used) and LFU (Least 
Frequently Used).
It's easy to implement a ConcurrentLFU based on the provided ConcurrentLRU template, 
following, of course, the main _contract_ of org.apache.solr.search.SolrCache.
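
As a rough illustration of the LRU/LFU difference, here is a minimal LFU 
sketch. It is deliberately not concurrent and does not implement 
org.apache.solr.search.SolrCache; the class name SimpleLfuCache and all of its 
methods are hypothetical, and eviction uses a plain linear scan for brevity.

```java
import java.util.HashMap;
import java.util.Map;

class SimpleLfuCache<K, V> {
    private final int maxSize;
    private final Map<K, V> values = new HashMap<K, V>();
    private final Map<K, Integer> hits = new HashMap<K, Integer>();

    SimpleLfuCache(int maxSize) {
        this.maxSize = maxSize;
    }

    // Returns the cached value and counts the hit, or null on a miss.
    V get(K key) {
        V value = values.get(key);
        if (value != null) {
            hits.put(key, hits.get(key) + 1);
        }
        return value;
    }

    // Stores a value; a new key evicts the least frequently used entry if full.
    void put(K key, V value) {
        if (!values.containsKey(key) && values.size() >= maxSize) {
            evictLeastFrequentlyUsed();
        }
        values.put(key, value);
        if (!hits.containsKey(key)) {
            hits.put(key, 0);
        }
    }

    boolean containsKey(K key) {
        return values.containsKey(key);
    }

    // Linear scan for the key with the fewest hits; fine for a sketch only.
    private void evictLeastFrequentlyUsed() {
        K victim = null;
        int fewest = Integer.MAX_VALUE;
        for (Map.Entry<K, Integer> e : hits.entrySet()) {
            if (e.getValue() < fewest) {
                fewest = e.getValue();
                victim = e.getKey();
            }
        }
        values.remove(victim);
        hits.remove(victim);
    }

    public static void main(String[] args) {
        SimpleLfuCache<String, Integer> cache = new SimpleLfuCache<String, Integer>(3);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.put("c", 3);
        for (int i = 0; i < 3; i++) cache.get("a"); // 3 hits, but oldest access
        for (int i = 0; i < 2; i++) cache.get("b"); // 2 hits
        cache.get("c");                             // 1 hit, most recent access
        cache.put("d", 4);                          // LFU evicts "c"; LRU would evict "a"
        System.out.println("a kept: " + cache.containsKey("a"));
        System.out.println("c kept: " + cache.containsKey("c"));
    }
}
```

With this access pattern an LRU cache would drop "a", because it was touched 
longest ago, while the LFU sketch drops "c", because it has the fewest hits.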

> FIFO Cache (Unsynchronized): 9x times performance boost
> ---
>
> Key: SOLR-665
> URL: https://issues.apache.org/jira/browse/SOLR-665
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 1.3
> Environment: JRockit R27 (Java 6)
>Reporter: Fuad Efendi
> Attachments: FIFOCache.java
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Attached is a modified version of LRUCache where 
> 1. map = new LinkedHashMap(initialSize, 0.75f, false) - so that 
> "reordering"/true (the performance bottleneck of LRU) is replaced with 
> "insertion-order"/false (so that it becomes FIFO)
> 2. Almost all (absolutely unnecessary) synchronized statements are commented out
> See discussion at 
> http://www.nabble.com/LRUCache---synchronized%21--td16439831.html
> Performance metrics (taken from SOLR Admin):
> LRU
> Requests: 7638
> Average Time-Per-Request: 15300
> Average Request-per-Second: 0.06
> FIFO:
> Requests: 3355
> Average Time-Per-Request: 1610
> Average Request-per-Second: 0.11
> Performance increased 9 times, which roughly corresponds to the number of CPUs in 
> the system, http://www.tokenizer.org/ (Shopping Search Engine at Tokenizer.org)
> Current number of documents: 7494689
> name:  filterCache  
> class:  org.apache.solr.search.LRUCache  
> version:  1.0  
> description:  LRU Cache(maxSize=1000, initialSize=1000)  
> stats:  lookups : 15966954582
> hits : 16391851546
> hitratio : 0.102
> inserts : 4246120
> evictions : 0
> size : 2668705
> cumulative_lookups : 16415839763
> cumulative_hits : 16411608101
> cumulative_hitratio : 0.99
> cumulative_inserts : 4246246
> cumulative_evictions : 0 
> Thanks

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-433) MultiCore and SpellChecker replication

2008-07-28 Thread Jeremy Hinegardner (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617694#action_12617694
 ] 

Jeremy Hinegardner commented on SOLR-433:
-

If this patch works for folks, I would like to see it committed and put in the 
nightly snapshot.  Or at least the RunExecutableListener.patch and 
solr-433.patch.

If there is any more work here, I'd be happy to work on it.  

> MultiCore and SpellChecker replication
> --
>
> Key: SOLR-433
> URL: https://issues.apache.org/jira/browse/SOLR-433
> Project: Solr
>  Issue Type: Improvement
>  Components: replication, spellchecker
>Affects Versions: 1.3
>Reporter: Otis Gospodnetic
> Attachments: RunExecutableListener.patch, solr-433.patch, 
> spellindexfix.patch
>
>
> With MultiCore functionality coming along, it looks like we'll need to be 
> able to:
>   A) snapshot each core's index directory, and
>   B) replicate any and all cores' complete data directories, not just their 
> index directories.
> Pulled from the "spellchecker and multi-core index replication" thread - 
> http://markmail.org/message/pj2rjzegifd6zm7m
> Otis:
> I think that makes sense - distribute everything for a given core, not just 
> its index.  And the spellchecker could then also have its data dir (and only 
> index/ underneath really) and be replicated in the same fashion.
> Right?
> Ryan:
> Yes, that was my thought.  If an arbitrary directory could be distributed, 
> then you could have
>   /path/to/dist/index/...
>   /path/to/dist/spelling-index/...
>   /path/to/dist/foo
> and that would all get put into a snapshot.  This would also let you put 
> multiple cores within a single distribution:
>   /path/to/dist/core0/index/...
>   /path/to/dist/core0/spelling-index/...
>   /path/to/dist/core0/foo
>   /path/to/dist/core1/index/...
>   /path/to/dist/core1/spelling-index/...
>   /path/to/dist/core1/foo

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


