from:"Fuad Efendi"


 [ 
https://issues.apache.org/jira/browse/SOLR-711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fuad Efendi updated SOLR-711:
-

Description: 
From 
[http://www.nabble.com/SimpleFacets%3A-Performance-Boost-for-Tokenized-Fields-td19033760.html]:

Scenario:
- 10,000,000 documents in the index; 
- 5-10 terms per document; 
- 200,000 unique terms for a tokenized field. 

_Obviously calculating sizes of 200,000 intersections with FilterCache is 100 
times slower than traversing 10 - 20,000 documents for smaller DocSets and 
counting frequencies of Terms._

Not applicable if size of DocSet is close to total number of unique tokens 
(200,000 in our scenario).

See   SimpleFacets:
{code:title=SimpleFacets.java|borderStyle=solid}
public NamedList getFacetTermEnumCounts(
  SolrIndexSearcher searcher, 
  DocSet docs, ...
{code}




  was:
From 
[url]http://www.nabble.com/SimpleFacets%3A-Performance-Boost-for-Tokenized-Fields-td19033760.html[/url]:

Scenario:
- 10,000,000 documents in the index; 
- 5-10 terms per document; 
- 200,000 unique terms for a tokenized field. 

_Obviously calculating sizes of 200,000 intersections with FilterCache is 100 
times slower than traversing 10 - 20,000 documents for smaller DocSets and 
counting frequencies of Terms._

Not applicable if size of DocSet is close to total number of unique tokens 
(200,000 in our scenario).

See   SimpleFacets:
 {{
public NamedList getFacetTermEnumCounts(
  SolrIndexSearcher searcher, 
  DocSet docs, 
  String field, 
  int offset, 
  int limit, 
  int mincount, 
  boolean missing, 
  boolean sort, 
  String prefix)
throws IOException {...}
}}





trivial formatting

> SimpleFacets: Performance Boost for Tokenized Fields for smaller DocSet using 
> Term Vectors
> --
>
> Key: SOLR-711
> URL: https://issues.apache.org/jira/browse/SOLR-711
> Project: Solr
>  Issue Type: Improvement
>  Components: search
>Affects Versions: 1.3
>Reporter: Fuad Efendi
> Fix For: 1.4
>
>   Original Estimate: 1680h
>  Remaining Estimate: 1680h
>
> From 
> [http://www.nabble.com/SimpleFacets%3A-Performance-Boost-for-Tokenized-Fields-td19033760.html]:
> Scenario:
> - 10,000,000 documents in the index; 
> - 5-10 terms per document; 
> - 200,000 unique terms for a tokenized field. 
> _Obviously calculating sizes of 200,000 intersections with FilterCache is 100 
> times slower than traversing 10 - 20,000 documents for smaller DocSets and 
> counting frequencies of Terms._
> Not applicable if size of DocSet is close to total number of unique tokens 
> (200,000 in our scenario).
> See   SimpleFacets:
> {code:title=SimpleFacets.java|borderStyle=solid}
> public NamedList getFacetTermEnumCounts(
>   SolrIndexSearcher searcher, 
>   DocSet docs, ...
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (SOLR-711) SimpleFacets: Performance Boost for Tokenized Fields for smaller DocSet using Term Vectors


 [ 
https://issues.apache.org/jira/browse/SOLR-711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fuad Efendi updated SOLR-711:
-

Description: 
From 
[http://www.nabble.com/SimpleFacets%3A-Performance-Boost-for-Tokenized-Fields-td19033760.html]:

Scenario:
- 10,000,000 documents in the index; 
- 5-10 terms per document; 
- 200,000 unique terms for a tokenized field. 

_Obviously calculating sizes of 200,000 intersections with FilterCache is 100 
times slower than traversing 10 - 20,000 documents for smaller DocSets and 
counting frequencies of Terms._

Not applicable if size of DocSet is close to total number of unique tokens 
(200,000 in our scenario).

See   SimpleFacets.java:
{code}
public NamedList getFacetTermEnumCounts(
  SolrIndexSearcher searcher, 
  DocSet docs, ...
{code}




  was:
From 
[http://www.nabble.com/SimpleFacets%3A-Performance-Boost-for-Tokenized-Fields-td19033760.html]:

Scenario:
- 10,000,000 documents in the index; 
- 5-10 terms per document; 
- 200,000 unique terms for a tokenized field. 

_Obviously calculating sizes of 200,000 intersections with FilterCache is 100 
times slower than traversing 10 - 20,000 documents for smaller DocSets and 
counting frequencies of Terms._

Not applicable if size of DocSet is close to total number of unique tokens 
(200,000 in our scenario).

See   SimpleFacets:
{code:title=SimpleFacets.java|borderStyle=solid}
public NamedList getFacetTermEnumCounts(
  SolrIndexSearcher searcher, 
  DocSet docs, ...
{code}





> SimpleFacets: Performance Boost for Tokenized Fields for smaller DocSet using 
> Term Vectors
> --
>
> Key: SOLR-711
> URL: https://issues.apache.org/jira/browse/SOLR-711
> Project: Solr
>  Issue Type: Improvement
>  Components: search
>Affects Versions: 1.3
>Reporter: Fuad Efendi
> Fix For: 1.4
>
>   Original Estimate: 1680h
>  Remaining Estimate: 1680h
>
> From 
> [http://www.nabble.com/SimpleFacets%3A-Performance-Boost-for-Tokenized-Fields-td19033760.html]:
> Scenario:
> - 10,000,000 documents in the index; 
> - 5-10 terms per document; 
> - 200,000 unique terms for a tokenized field. 
> _Obviously calculating sizes of 200,000 intersections with FilterCache is 100 
> times slower than traversing 10 - 20,000 documents for smaller DocSets and 
> counting frequencies of Terms._
> Not applicable if size of DocSet is close to total number of unique tokens 
> (200,000 in our scenario).
> See   SimpleFacets.java:
> {code}
> public NamedList getFacetTermEnumCounts(
>   SolrIndexSearcher searcher, 
>   DocSet docs, ...
> {code}


[jira] Updated: (SOLR-711) SimpleFacets: Performance Boost for Tokenized Fields for smaller DocSet using Term Vectors


 [ 
https://issues.apache.org/jira/browse/SOLR-711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fuad Efendi updated SOLR-711:
-

Comment: was deleted

> SimpleFacets: Performance Boost for Tokenized Fields for smaller DocSet using 
> Term Vectors
> --
>
> Key: SOLR-711
> URL: https://issues.apache.org/jira/browse/SOLR-711
> Project: Solr
>  Issue Type: Improvement
>  Components: search
>Affects Versions: 1.3
>Reporter: Fuad Efendi
> Fix For: 1.4
>
>   Original Estimate: 1680h
>  Remaining Estimate: 1680h
>
> From 
> [http://www.nabble.com/SimpleFacets%3A-Performance-Boost-for-Tokenized-Fields-td19033760.html]:
> Scenario:
> - 10,000,000 documents in the index; 
> - 5-10 terms per document; 
> - 200,000 unique terms for a tokenized field. 
> _Obviously calculating sizes of 200,000 intersections with FilterCache is 100 
> times slower than traversing 10 - 20,000 documents for smaller DocSets and 
> counting frequencies of Terms._
> Not applicable if size of DocSet is close to total number of unique tokens 
> (200,000 in our scenario).
> See   SimpleFacets:
> {code:title=SimpleFacets.java|borderStyle=solid}
> public NamedList getFacetTermEnumCounts(
>   SolrIndexSearcher searcher, 
>   DocSet docs, ...
> {code}


[jira] Created: (SOLR-711) SimpleFacets: Performance Boost for Tokenized Fields for smaller DocSet using Term Vectors

SimpleFacets: Performance Boost for Tokenized Fields for smaller DocSet using 
Term Vectors
--

 Key: SOLR-711
 URL: https://issues.apache.org/jira/browse/SOLR-711
 Project: Solr
  Issue Type: Improvement
  Components: search
Affects Versions: 1.3
Reporter: Fuad Efendi
 Fix For: 1.4


From 
[url]http://www.nabble.com/SimpleFacets%3A-Performance-Boost-for-Tokenized-Fields-td19033760.html[/url]:

Scenario:
- 10,000,000 documents in the index; 
- 5-10 terms per document; 
- 200,000 unique terms for a tokenized field. 

_Obviously calculating sizes of 200,000 intersections with FilterCache is 100 
times slower than traversing 10 - 20,000 documents for smaller DocSets and 
counting frequencies of Terms._

Not applicable if size of DocSet is close to total number of unique tokens 
(200,000 in our scenario).

See   SimpleFacets:
 {{
public NamedList getFacetTermEnumCounts(
  SolrIndexSearcher searcher, 
  DocSet docs, 
  String field, 
  int offset, 
  int limit, 
  int mincount, 
  boolean missing, 
  boolean sort, 
  String prefix)
throws IOException {...}
}}
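
To make the trade-off in the scenario concrete, here is a minimal sketch in plain Java collections. It does not use the Solr or Lucene API: the class name FacetCountSketch, the map termsByDoc (standing in for per-document term vectors) and the map docsByTerm (standing in for cached per-term DocSets) are all assumptions made for illustration. Strategy A does one intersection per unique term, so its cost follows the number of unique terms; strategy B visits only the documents in the DocSet, so its cost follows the DocSet size.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class FacetCountSketch {

    // Strategy A: intersect the DocSet with the doc set of every unique term.
    // Work is proportional to the number of unique terms (200,000 in the scenario).
    static Map<String, Integer> countByIntersection(
            Map<String, Set<Integer>> docsByTerm, Set<Integer> docSet) {
        Map<String, Integer> counts = new HashMap<>();
        for (Map.Entry<String, Set<Integer>> entry : docsByTerm.entrySet()) {
            Set<Integer> intersection = new HashSet<>(entry.getValue());
            intersection.retainAll(docSet);
            if (!intersection.isEmpty()) {
                counts.put(entry.getKey(), intersection.size());
            }
        }
        return counts;
    }

    // Strategy B: walk only the documents in the DocSet and tally their terms.
    // Work is proportional to (DocSet size) x (terms per document).
    static Map<String, Integer> countByTraversal(
            Map<Integer, List<String>> termsByDoc, Set<Integer> docSet) {
        Map<String, Integer> counts = new HashMap<>();
        for (Integer doc : docSet) {
            List<String> terms = termsByDoc.get(doc);
            if (terms == null) {
                continue;
            }
            // De-duplicate so a term is counted once per document.
            for (String term : new HashSet<>(terms)) {
                counts.merge(term, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Builds the inverted view (term -> documents) from the forward view.
    static Map<String, Set<Integer>> invert(Map<Integer, List<String>> termsByDoc) {
        Map<String, Set<Integer>> docsByTerm = new HashMap<>();
        for (Map.Entry<Integer, List<String>> entry : termsByDoc.entrySet()) {
            for (String term : entry.getValue()) {
                docsByTerm.computeIfAbsent(term, t -> new HashSet<>()).add(entry.getKey());
            }
        }
        return docsByTerm;
    }

    public static void main(String[] args) {
        // Hypothetical three-document index.
        Map<Integer, List<String>> termsByDoc = new HashMap<>();
        termsByDoc.put(1, Arrays.asList("red", "shoe"));
        termsByDoc.put(2, Arrays.asList("red", "hat"));
        termsByDoc.put(3, Arrays.asList("blue", "shoe"));
        Set<Integer> docSet = new HashSet<>(Arrays.asList(1, 2));

        System.out.println(countByTraversal(termsByDoc, docSet));
        System.out.println(countByIntersection(invert(termsByDoc), docSet));
    }
}
```

Both methods return the same counts; only the amount of work differs. A real implementation would still have to decide at which DocSet size to switch strategies, which this sketch does not attempt.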





Re: Only 3 issues left

2008-08-18 Thread Fuad Efendi

Such as... using Term Vectors for fast faceting on tokenized fields... but that  
is already in the TODOs of the source files!
 The CNET data was so small when SimpleFacets was born, only  
40 docs...


Quoting Shalin Shekhar Mangar <[EMAIL PROTECTED]>:


Only 3... so alien... must... add... more... issues... argh!

On Mon, Aug 18, 2008 at 9:20 PM, Otis Gospodnetic <
[EMAIL PROTECTED]> wrote:


Hi

Look mom, only 3 issues to go!


https://issues.apache.org/jira/secure/IssueNavigator.jspa?reset=true&mode=hide&sorter/order=DESC&sorter/field=priority&resolution=-1&pid=12310230&fixfor=12312486

Out of those, 1 is trivial (lucene jar update), 1 looks committed (the
maven one), and only SOLR-646 is "serious".


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch





--
Regards,
Shalin Shekhar Mangar.

[jira] Updated: (SOLR-671) Range queries with 'slong' field type do not retrieve correct results


 [ 
https://issues.apache.org/jira/browse/SOLR-671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fuad Efendi updated SOLR-671:
-

 Priority: Trivial  (was: Major)
   Issue Type: Test  (was: Bug)
Affects Version/s: (was: 1.3)

> Range queries with 'slong' field type do not retrieve correct results
> -
>
> Key: SOLR-671
> URL: https://issues.apache.org/jira/browse/SOLR-671
> Project: Solr
>  Issue Type: Test
> Environment: SOLR-1.3-DEV 
> Schema:
>
>  sortMissingLast="true" omitNorms="true"/>
>  sortMissingLast="true" omitNorms="true"/>
>  sortMissingLast="true" omitNorms="true"/>
>  sortMissingLast="true" omitNorms="true"/>
>
>Reporter: Fuad Efendi
>Priority: Trivial
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> Range queries always return all results (do not filter):
> timestamp:[1019386401114 TO 1219386401114]
> 
> timestamp:[1019386401114 TO 1219386401114]
> timestamp:[1019386401114 TO 1219386401114]
> timestamp:[1019386401114 TO 1219386401114]
> timestamp:[#8;#0;εごᅚ TO #8;#0;ѯ刯慚]
> ...
> OldLuceneQParser


[jira] Commented: (SOLR-671) Range queries with 'slong' field type do not retrieve correct results


[ 
https://issues.apache.org/jira/browse/SOLR-671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12619227#action_12619227
 ] 

Fuad Efendi commented on SOLR-671:
--

{code}
long time1 = System.currentTimeMillis();
long time2 = 30*24*3600*1000;
long time3 = time1 - time2;
System.out.println("Time1: "+time1);
System.out.println("Time2: " +time2);
System.out.println("Time3: "+time3);

Time1: 1217686478242
Time2: -1702967296
Time3: 1219389445538
{code}

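The bug is obvious: 30*24*3600*1000 is evaluated in 32-bit int arithmetic and overflows before it is widened to long. With a long literal (1000L) it works: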

{code}
long time1 = System.currentTimeMillis();
long time2 = 30*24*3600*1000L;
long time3 = time1 - time2;
System.out.println("Time1: "+time1);
System.out.println("Time2: " +time2);
System.out.println("Time3: "+time3);

Time1: 1217686559557
Time2: 2592000000
Time3: 1215094559557
{code}


Close it...
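
For reference, a small self-contained sketch of the same pitfall and three ways around it. The class and method names are made up for illustration. `TimeUnit.DAYS.toMillis` and `Math.multiplyExact` are standard JDK calls, but `Math.multiplyExact` assumes Java 8 or later, which is newer than the JVMs discussed here.

```java
import java.util.concurrent.TimeUnit;

final class MillisOverflowSketch {

    // All four operands are int, so the product wraps around in 32 bits
    // before it is widened to long on assignment.
    static long thirtyDaysBuggy() {
        return 30 * 24 * 3600 * 1000;
    }

    // Making the first operand a long forces the whole product into 64 bits.
    static long thirtyDaysFixed() {
        return 30L * 24 * 3600 * 1000;
    }

    // Lets the library do the unit conversion.
    static long thirtyDaysViaTimeUnit() {
        return TimeUnit.DAYS.toMillis(30);
    }

    // Math.multiplyExact(int, int) throws instead of wrapping silently.
    static boolean intProductOverflows() {
        try {
            Math.multiplyExact(30 * 24 * 3600, 1000);
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(thirtyDaysBuggy());       // -1702967296
        System.out.println(thirtyDaysFixed());       // 2592000000
        System.out.println(thirtyDaysViaTimeUnit()); // 2592000000
        System.out.println(intProductOverflows());   // true
    }
}
```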



> Range queries with 'slong' field type do not retrieve correct results
> -
>
> Key: SOLR-671
> URL: https://issues.apache.org/jira/browse/SOLR-671
> Project: Solr
>  Issue Type: Bug
> Environment: SOLR-1.3-DEV 
> Schema:
>
>  sortMissingLast="true" omitNorms="true"/>
>  sortMissingLast="true" omitNorms="true"/>
>  sortMissingLast="true" omitNorms="true"/>
>  sortMissingLast="true" omitNorms="true"/>
>
>Reporter: Fuad Efendi
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> Range queries always return all results (do not filter):
> timestamp:[1019386401114 TO 1219386401114]
> 
> timestamp:[1019386401114 TO 1219386401114]
> timestamp:[1019386401114 TO 1219386401114]
> timestamp:[1019386401114 TO 1219386401114]
> timestamp:[#8;#0;εごᅚ TO #8;#0;ѯ刯慚]
> ...
> OldLuceneQParser


[jira] Issue Comment Edited: (SOLR-671) Range queries with 'slong' field type do not retrieve correct results


[ 
https://issues.apache.org/jira/browse/SOLR-671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12619223#action_12619223
 ] 

funtick edited comment on SOLR-671 at 8/2/08 7:12 AM:
--

Here is a test case, similar to the Arrays.sort() bug (unsigned...):

{code}

long time1 = System.currentTimeMillis();
long time2 = 30*24*3600*1000;
System.out.println(time1);
System.out.println(time1-time2);

Output:
1219389000674
1221091967970
{code}

(time1-time2) > time1!

What happens inside SOLR slong for such queries?


  was (Author: funtick):
Here is test case, similar to Arrays.sort() bug (unsigned...):

{code}
long time1 = System.currentTimeMillis() - 30*24*3600*1000;
long time2 = 30*24*3600*1000;
System.out.println(time1);
System.out.println(time1-time2);

Output:
1219389000674
1221091967970
{code}

(time1-time2) > time1!

What happens inside SOLR slong for such queries?

  
> Range queries with 'slong' field type do not retrieve correct results
> -
>
> Key: SOLR-671
> URL: https://issues.apache.org/jira/browse/SOLR-671
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 1.3
> Environment: SOLR-1.3-DEV 
> Schema:
>
>  sortMissingLast="true" omitNorms="true"/>
>  sortMissingLast="true" omitNorms="true"/>
>  sortMissingLast="true" omitNorms="true"/>
>  sortMissingLast="true" omitNorms="true"/>
>
>Reporter: Fuad Efendi
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> Range queries always return all results (do not filter):
> timestamp:[1019386401114 TO 1219386401114]
> 
> timestamp:[1019386401114 TO 1219386401114]
> timestamp:[1019386401114 TO 1219386401114]
> timestamp:[1019386401114 TO 1219386401114]
> timestamp:[#8;#0;εごᅚ TO #8;#0;ѯ刯慚]
> ...
> OldLuceneQParser


[jira] Updated: (SOLR-671) Range queries with 'slong' field type do not retrieve correct results


 [ 
https://issues.apache.org/jira/browse/SOLR-671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fuad Efendi updated SOLR-671:
-

  Priority: Major  (was: Trivial)
Issue Type: Bug  (was: Test)

Here is a test case, similar to the Arrays.sort() bug (unsigned...):

{code}
long time1 = System.currentTimeMillis() - 30*24*3600*1000;
long time2 = 30*24*3600*1000;
System.out.println(time1);
System.out.println(time1-time2);

Output:
1219389000674
1221091967970
{code}

(time1-time2) > time1!

What happens inside SOLR slong for such queries?


> Range queries with 'slong' field type do not retrieve correct results
> -
>
> Key: SOLR-671
> URL: https://issues.apache.org/jira/browse/SOLR-671
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 1.3
> Environment: SOLR-1.3-DEV 
> Schema:
>
>  sortMissingLast="true" omitNorms="true"/>
>  sortMissingLast="true" omitNorms="true"/>
>  sortMissingLast="true" omitNorms="true"/>
>  sortMissingLast="true" omitNorms="true"/>
>
>Reporter: Fuad Efendi
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> Range queries always return all results (do not filter):
> timestamp:[1019386401114 TO 1219386401114]
> 
> timestamp:[1019386401114 TO 1219386401114]
> timestamp:[1019386401114 TO 1219386401114]
> timestamp:[1019386401114 TO 1219386401114]
> timestamp:[#8;#0;εごᅚ TO #8;#0;ѯ刯慚]
> ...
> OldLuceneQParser


[jira] Updated: (SOLR-671) Range queries with 'slong' field type do not retrieve correct results


 [ 
https://issues.apache.org/jira/browse/SOLR-671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fuad Efendi updated SOLR-671:
-

  Priority: Trivial  (was: Blocker)
Issue Type: Test  (was: Bug)

I executed another query which works fine:
timestamp:[* TO 1000] - 0 results
Finally found it works...

Please close.

> Range queries with 'slong' field type do not retrieve correct results
> -
>
> Key: SOLR-671
> URL: https://issues.apache.org/jira/browse/SOLR-671
> Project: Solr
>  Issue Type: Test
>Affects Versions: 1.3
> Environment: SOLR-1.3-DEV 
> Schema:
>
>  sortMissingLast="true" omitNorms="true"/>
>  sortMissingLast="true" omitNorms="true"/>
>  sortMissingLast="true" omitNorms="true"/>
>  sortMissingLast="true" omitNorms="true"/>
>
>Reporter: Fuad Efendi
>Priority: Trivial
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> Range queries always return all results (do not filter):
> timestamp:[1019386401114 TO 1219386401114]
> 
> timestamp:[1019386401114 TO 1219386401114]
> timestamp:[1019386401114 TO 1219386401114]
> timestamp:[1019386401114 TO 1219386401114]
> timestamp:[#8;#0;εごᅚ TO #8;#0;ѯ刯慚]
> ...
> OldLuceneQParser


[jira] Updated: (SOLR-671) Range queries with 'slong' field type do not retrieve correct results


 [ 
https://issues.apache.org/jira/browse/SOLR-671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fuad Efendi updated SOLR-671:
-

 Priority: Blocker  (was: Major)
Affects Version/s: 1.3

> Range queries with 'slong' field type do not retrieve correct results
> -
>
> Key: SOLR-671
> URL: https://issues.apache.org/jira/browse/SOLR-671
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 1.3
> Environment: SOLR-1.3-DEV 
> Schema:
>
>  sortMissingLast="true" omitNorms="true"/>
>  sortMissingLast="true" omitNorms="true"/>
>  sortMissingLast="true" omitNorms="true"/>
>  sortMissingLast="true" omitNorms="true"/>
>
>Reporter: Fuad Efendi
>Priority: Blocker
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> Range queries always return all results (do not filter):
> timestamp:[1019386401114 TO 1219386401114]
> 
> timestamp:[1019386401114 TO 1219386401114]
> timestamp:[1019386401114 TO 1219386401114]
> timestamp:[1019386401114 TO 1219386401114]
> timestamp:[#8;#0;εごᅚ TO #8;#0;ѯ刯慚]
> ...
> OldLuceneQParser


[jira] Created: (SOLR-671) Range queries with 'slong' field type do not retrieve correct results

Range queries with 'slong' field type do not retrieve correct results
-

 Key: SOLR-671
 URL: https://issues.apache.org/jira/browse/SOLR-671
 Project: Solr
  Issue Type: Bug
 Environment: SOLR-1.3-DEV 

Schema:

   





   


Reporter: Fuad Efendi


Range queries always return all results (do not filter):

timestamp:[1019386401114 TO 1219386401114]



timestamp:[1019386401114 TO 1219386401114]
timestamp:[1019386401114 TO 1219386401114]
timestamp:[1019386401114 TO 1219386401114]
timestamp:[#8;#0;εごᅚ TO #8;#0;ѯ刯慚]

...

OldLuceneQParser


[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-08-01 Thread Fuad Efendi (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12619058#action_12619058
 ] 

Fuad Efendi commented on SOLR-665:
--

The guys at LingPipe (Natural Language Processing, http://alias-i.com/) are using 
excellent Map implementations with an optimistic concurrency strategy:
http://alias-i.com/lingpipe/docs/api/com/aliasi/util/FastCache.html
http://alias-i.com/lingpipe/docs/api/com/aliasi/util/HardFastCache.html


> FIFO Cache (Unsynchronized): 9x times performance boost
> ---
>
> Key: SOLR-665
> URL: https://issues.apache.org/jira/browse/SOLR-665
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 1.3
> Environment: JRockit R27 (Java 6)
>Reporter: Fuad Efendi
> Attachments: ConcurrentFIFOCache.java, ConcurrentFIFOCache.java, 
> ConcurrentLRUCache.java, ConcurrentLRUWeakCache.java, FIFOCache.java, 
> SimplestConcurrentLRUCache.java
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Attached is a modified version of LRUCache where 
> 1. map = new LinkedHashMap(initialSize, 0.75f, false) - so that 
> "reordering"/true (performance bottleneck of LRU) is replaced with 
> "insertion-order"/false (so that it becomes FIFO)
> 2. Almost all (absolutely unnecessary) synchronized statements are commented out
> See discussion at 
> http://www.nabble.com/LRUCache---synchronized%21--td16439831.html
> Performance metrics (taken from SOLR Admin):
> LRU
> Requests: 7638
> Average Time-Per-Request: 15300
> Average Request-per-Second: 0.06
> FIFO:
> Requests: 3355
> Average Time-Per-Request: 1610
> Average Request-per-Second: 0.11
> Performance increased 9 times which roughly corresponds to a number of CPU in 
> a system, http://www.tokenizer.org/ (Shopping Search Engine at Tokenizer.org)
> Current number of documents: 7494689
> name:  filterCache  
> class:org.apache.solr.search.LRUCache  
> version:  1.0  
> description:  LRU Cache(maxSize=1000, initialSize=1000)  
> stats:lookups : 15966954582
> hits : 16391851546
> hitratio : 0.102
> inserts : 4246120
> evictions : 0
> size : 2668705
> cumulative_lookups : 16415839763
> cumulative_hits : 16411608101
> cumulative_hitratio : 0.99
> cumulative_inserts : 4246246
> cumulative_evictions : 0 
> Thanks
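
The difference between the two LinkedHashMap modes mentioned in the description can be sketched as follows. This is not the attached FIFOCache.java: the class name BoundedCacheSketch and its constructor are assumptions for illustration, and only the `LinkedHashMap(int, float, boolean)` constructor and the `removeEldestEntry` hook come from the JDK. The sketch is single-threaded and says nothing about synchronization.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class BoundedCacheSketch<K, V> extends LinkedHashMap<K, V> {

    private final int maxSize;

    // accessOrder = true  -> LRU: get() moves the entry to the end of the list.
    // accessOrder = false -> FIFO: order is fixed at insertion time.
    BoundedCacheSketch(int maxSize, boolean accessOrder) {
        super(16, 0.75f, accessOrder);
        this.maxSize = maxSize;
    }

    // Called by put(); returning true evicts the entry at the head of the list.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxSize;
    }

    public static void main(String[] args) {
        BoundedCacheSketch<String, Integer> fifo = new BoundedCacheSketch<>(2, false);
        BoundedCacheSketch<String, Integer> lru = new BoundedCacheSketch<>(2, true);
        for (BoundedCacheSketch<String, Integer> cache : java.util.Arrays.asList(fifo, lru)) {
            cache.put("a", 1);
            cache.put("b", 2);
            cache.get("a");    // only matters in access-order (LRU) mode
            cache.put("c", 3); // exceeds maxSize, so one entry is evicted
        }
        System.out.println(fifo.keySet()); // [b, c]: "a" was inserted first
        System.out.println(lru.keySet());  // [a, c]: "b" was least recently used
    }
}
```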


[jira] Commented: (SOLR-667) Alternate LRUCache implementation


[ 
https://issues.apache.org/jira/browse/SOLR-667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12618824#action_12618824
 ] 

Fuad Efendi commented on SOLR-667:
--

Thanks Yonik, I would even guess that in some cases synchronization is faster than 
sun.misc.Unsafe.compareAndSwapLong(this, valueOffset, expect, update);

{code}
public final long incrementAndGet() {
    for (;;) {
        long current = get();
        long next = current + 1;
        if (compareAndSet(current, next))
            return next;
    }
}
{code}

- an extreme level of safety with some level of concurrency... Do we need an exact 
value for 'stats.inserts' (if it is not synchronized)? 

It can be 'long' inside synchronized block...
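
A minimal sketch of that idea, under the assumption that the map is already guarded by a single lock: the counter is then a plain long updated under the same lock, and an AtomicLong is kept alongside only for comparison. The class InsertCounterSketch and its methods are hypothetical and are not the Solr LRUCache.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

final class InsertCounterSketch {

    private final Map<Object, Object> map = new HashMap<>();
    private long inserts; // guarded by "this", same lock as the map

    // One lock covers both the map update and the counter increment.
    synchronized Object put(Object key, Object value) {
        inserts++;
        return map.put(key, value);
    }

    synchronized long getInserts() {
        return inserts;
    }

    // Runs several writer threads and returns {plain counter, AtomicLong counter}.
    static long[] runDemo(int threads, int putsPerThread) {
        InsertCounterSketch cache = new InsertCounterSketch();
        AtomicLong atomic = new AtomicLong(); // lock-free counter, for comparison only
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < putsPerThread; i++) {
                    cache.put(id + ":" + i, i);
                    atomic.incrementAndGet();
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return new long[] { cache.getInserts(), atomic.get() };
    }

    public static void main(String[] args) {
        long[] result = runDemo(4, 1000);
        System.out.println("plain long under lock: " + result[0]);
        System.out.println("AtomicLong:            " + result[1]);
    }
}
```

Both counters end up exact here. Which of the two is cheaper under contention is a measurement question that this sketch does not answer.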


> Alternate LRUCache implementation
> -
>
> Key: SOLR-667
> URL: https://issues.apache.org/jira/browse/SOLR-667
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 1.3
>Reporter: Noble Paul
> Attachments: ConcurrentLRUCache.java
>
>
> The only available SolrCache i.e LRUCache is based on _LinkedHashMap_ which 
> has _get()_ also synchronized. This can cause severe bottlenecks for faceted 
> search. Any alternate implementation which can be faster/better must be 
> considered. 


[jira] Commented: (SOLR-667) Alternate LRUCache implementation


[ 
https://issues.apache.org/jira/browse/SOLR-667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12618805#action_12618805
 ] 

Fuad Efendi commented on SOLR-667:
--

Paul, 


I have never suggested using 'volatile' 'to avoid synchronization' in 
concurrent programming. I only noticed some extremely stupid code where SOLR 
uses _double_ synchronization with an AtomicLong inside:

{code}
  public synchronized Object put(Object key, Object value) {
    if (state == State.LIVE) {
      stats.inserts.incrementAndGet();
    }

    synchronized (map) {
      // increment local inserts regardless of state???
      // it does make it more consistent with the current size...
      inserts++;
      return map.put(key,value);
    }
  }
{code}

Each tool has an area of applicability, and even ConcurrentHashMap just 
slightly intersects with SOLR needs; SOLR does not need 'consistent view at a 
point in time' on cached objects.

'volatile' is part of the Java specification, and is implemented differently by different 
vendors. I use volatile (instead of the more expensive AtomicLong) solely to 
prevent the JVM HotSpot optimizer from doing some _not-applicable_ stuff...

> Alternate LRUCache implementation
> -
>
> Key: SOLR-667
> URL: https://issues.apache.org/jira/browse/SOLR-667
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 1.3
>Reporter: Noble Paul
> Attachments: ConcurrentLRUCache.java
>
>
> The only available SolrCache i.e LRUCache is based on _LinkedHashMap_ which 
> has _get()_ also synchronized. This can cause severe bottlenecks for faceted 
> search. Any alternate implementation which can be faster/better must be 
> considered. 


[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost


[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12618766#action_12618766
 ] 

Fuad Efendi commented on SOLR-665:
--

I don't think ConcurrentHashMap will improve performance, and ConcurrentMap is 
not what SOLR needs:
{code}
V putIfAbsent(K key, V value);
V replace(K key, V value);
boolean replace(K key, V oldValue, V newValue);
{code}

There is also some(...) overhead from _oldValue_ and _the state of the hash 
table at some point_, additional memory requirements, etc. Can we design 
something plainer and simpler, focused on SOLR-specific requirements, without 
all the functionality of Map?

> FIFO Cache (Unsynchronized): 9x times performance boost
> ---
>
> Key: SOLR-665
> URL: https://issues.apache.org/jira/browse/SOLR-665
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 1.3
> Environment: JRockit R27 (Java 6)
>Reporter: Fuad Efendi
> Attachments: ConcurrentFIFOCache.java, ConcurrentFIFOCache.java, 
> ConcurrentLRUCache.java, ConcurrentLRUWeakCache.java, FIFOCache.java, 
> SimplestConcurrentLRUCache.java
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Attached is a modified version of LRUCache where 
> 1. map = new LinkedHashMap(initialSize, 0.75f, false) - so that 
> "reordering"/true (performance bottleneck of LRU) is replaced with 
> "insertion-order"/false (so that it becomes FIFO)
> 2. Almost all (absolutely unnecessary) synchronized statements are commented out
> See discussion at 
> http://www.nabble.com/LRUCache---synchronized%21--td16439831.html
> Performance metrics (taken from SOLR Admin):
> LRU
> Requests: 7638
> Average Time-Per-Request: 15300
> Average Request-per-Second: 0.06
> FIFO:
> Requests: 3355
> Average Time-Per-Request: 1610
> Average Request-per-Second: 0.11
> Performance increased 9 times which roughly corresponds to a number of CPU in 
> a system, http://www.tokenizer.org/ (Shopping Search Engine at Tokenizer.org)
> Current number of documents: 7494689
> name:  filterCache  
> class:org.apache.solr.search.LRUCache  
> version:  1.0  
> description:  LRU Cache(maxSize=1000, initialSize=1000)  
> stats:lookups : 15966954582
> hits : 16391851546
> hitratio : 0.102
> inserts : 4246120
> evictions : 0
> size : 2668705
> cumulative_lookups : 16415839763
> cumulative_hits : 16411608101
> cumulative_hitratio : 0.99
> cumulative_inserts : 4246246
> cumulative_evictions : 0 
> Thanks


[jira] Commented: (SOLR-667) Alternate LRUCache implementation


[ 
https://issues.apache.org/jira/browse/SOLR-667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12618750#action_12618750
 ] 

Fuad Efendi commented on SOLR-667:
--

bq. ...safety, where nothing bad ever happens to an object. 
When _SOLR_ adds an object to the cache or removes it from the cache, it does not change the object; 
it only manipulates internal arrays of pointers to objects (operations which are probably 
atomic, but I don't know such JVM & GC internals in depth...)

Looks heavy with TreeSet...


> Alternate LRUCache implementation
> -
>
> Key: SOLR-667
> URL: https://issues.apache.org/jira/browse/SOLR-667
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 1.3
>Reporter: Noble Paul
> Attachments: ConcurrentLRUCache.java
>
>
> The only available SolrCache i.e LRUCache is based on _LinkedHashMap_ which 
> has _get()_ also synchronized. This can cause severe bottlenecks for faceted 
> search. Any alternate implementation which can be faster/better must be 
> considered. 


[jira] Commented: (SOLR-669) SOLR currently does not support caching for (Query, FacetFieldList)


[ 
https://issues.apache.org/jira/browse/SOLR-669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12618578#action_12618578
 ] 

Fuad Efendi commented on SOLR-669:
--

To confirm: 
- SOLR uses Lucene internals (with caching) only if the field is non-tokenized, 
single-valued and non-boolean, and SOLR does not have its own cache to store calculated 
intersections (_faceting_).


> SOLR currently does not support caching for (Query, FacetFieldList)
> ---
>
> Key: SOLR-669
> URL: https://issues.apache.org/jira/browse/SOLR-669
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 1.3
>    Reporter: Fuad Efendi
>   Original Estimate: 1680h
>  Remaining Estimate: 1680h
>
> It is a huge performance bottleneck and it explains the huge difference between 
> qtime and SolrJ's elapsedTime. I quickly browsed SolrIndexSearcher: it caches 
> only (Key, DocSet/DocList ) key-value pairs and it does not have 
> cache for (Query, FacetFieldList).
> filterCache stores DocList for each 'filter' and is used for constant 
> recalculations...
> This would be significant performance improvement.


[jira] Commented: (SOLR-669) SOLR currently does not support caching for (Query, FacetFieldList)


[ 
https://issues.apache.org/jira/browse/SOLR-669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12618574#action_12618574
 ] 

Fuad Efendi commented on SOLR-669:
--

This piece of code in SimpleFacets:
{code}
if (sf.multiValued() || ft.isTokenized() || ft instanceof BoolField) {
  // Always use filters for booleans... we know the number of values is very small.
  counts = getFacetTermEnumCounts(searcher, docs, field, offset, limit, mincount,missing,sort,prefix);
} else {
  // TODO: future logic could use filters instead of the fieldcache if
  // the number of terms in the field is small enough.
  counts = getFieldCacheCounts(searcher, docs, field, offset,limit, mincount, missing, sort, prefix);
}
{code}

- the else branch is an optimization for single-valued non-tokenized fields: 'Lucene FieldCache to get 
counts for each unique field value in docs'


We should implement *additional* caching for the other branch, which uses _the FilterCache to 
get the intersection_; FilterCache stores only a DocSet and does not store the 
NamedList of field intersections:

{code}
  /**
   * Returns a list of terms in the specified field along with the 
   * corresponding count of documents in the set that match that constraint.
   * This method uses the FilterCache to get the intersection count between docs
   * and the DocSet for each term in the filter.
   *
   * @see FacetParams#FACET_LIMIT
   * @see FacetParams#FACET_ZEROS
   * @see FacetParams#FACET_MISSING
   */
  public NamedList getFacetTermEnumCounts(SolrIndexSearcher searcher, DocSet docs, String field, int offset, int limit, int mincount, boolean missing, boolean sort, String prefix)
    throws IOException {
    ...
  }
{code}
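
To make the difference between the two branches concrete, here is a minimal sketch of both counting strategies. It does not use Solr classes: a java.util.BitSet stands in for a DocSet and a String[] stands in for the FieldCache, and the class and method names are hypothetical, chosen only for illustration.

```java
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-ins: a BitSet plays the role of a DocSet, and
// valueOfDoc plays the role of a FieldCache array (one value per document).
final class FacetCountStrategies {

    // Filter-style counting: one intersection per unique term,
    // so the cost grows with the number of terms in the field.
    static Map<String, Integer> countByIntersection(Map<String, BitSet> termToDocs, BitSet docs) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (Map.Entry<String, BitSet> e : termToDocs.entrySet()) {
            BitSet both = (BitSet) e.getValue().clone();
            both.and(docs); // documents that contain the term AND belong to the base set
            counts.put(e.getKey(), both.cardinality());
        }
        return counts;
    }

    // FieldCache-style counting: walk the base set once and tally each value,
    // so the cost grows with the size of the base set.
    static Map<String, Integer> countByDocWalk(String[] valueOfDoc, BitSet docs) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (int d = docs.nextSetBit(0); d >= 0; d = docs.nextSetBit(d + 1)) {
            counts.merge(valueOfDoc[d], 1, Integer::sum);
        }
        return counts;
    }

    // Small helper to build a BitSet from document ids.
    static BitSet bits(int... ids) {
        BitSet b = new BitSet();
        for (int id : ids) {
            b.set(id);
        }
        return b;
    }

    public static void main(String[] args) {
        String[] valueOfDoc = {"red", "blue", "red", "green", "blue", "red"};
        Map<String, BitSet> termToDocs = new LinkedHashMap<>();
        termToDocs.put("red", bits(0, 2, 5));
        termToDocs.put("blue", bits(1, 4));
        termToDocs.put("green", bits(3));
        BitSet docs = bits(0, 1, 2); // the base set, e.g. the documents matching a query

        System.out.println(countByIntersection(termToDocs, docs)); // {red=2, blue=1, green=0}
        System.out.println(countByDocWalk(valueOfDoc, docs));      // {red=2, blue=1}
    }
}
```

The first method does work proportional to the number of unique terms, the second proportional to the size of the base set, which is the trade-off the comment above is about.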


> SOLR currently does not support caching for (Query, FacetFieldList)
> ---
>
> Key: SOLR-669
> URL: https://issues.apache.org/jira/browse/SOLR-669
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 1.3
>Reporter: Fuad Efendi
>   Original Estimate: 1680h
>  Remaining Estimate: 1680h
>
> It is a huge performance bottleneck, and it explains the huge difference between 
> QTime and SolrJ's elapsedTime. I quickly browsed SolrIndexSearcher: it caches 
> only (Key, DocSet/DocList) key-value pairs and it does not have 
> a cache for (Query, FacetFieldList).
> filterCache stores a DocSet for each 'filter' and is used for constant 
> recalculations...
> Adding such a cache would be a significant performance improvement.
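
The kind of cache the issue asks for could be sketched roughly as below. This is only an illustration under stated assumptions, not Solr code: FacetResultCache, Key and getOrCompute are hypothetical names, the facet counts are modelled as a plain Map instead of a NamedList, and a real implementation would also have to discard entries whenever the index changes.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;
import java.util.function.Supplier;

// Hypothetical cache of facet counts keyed by (query, facet field).
final class FacetResultCache {

    // Cache key: the query string plus the facet field name (both illustrative).
    static final class Key {
        private final String query;
        private final String facetField;

        Key(String query, String facetField) {
            this.query = query;
            this.facetField = facetField;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Key)) {
                return false;
            }
            Key other = (Key) o;
            return query.equals(other.query) && facetField.equals(other.facetField);
        }

        @Override
        public int hashCode() {
            return Objects.hash(query, facetField);
        }
    }

    private final Map<Key, Map<String, Integer>> cache;

    FacetResultCache(final int maxSize) {
        // Access-ordered LinkedHashMap: the least recently used entry is evicted first.
        this.cache = new LinkedHashMap<Key, Map<String, Integer>>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Key, Map<String, Integer>> eldest) {
                return size() > maxSize;
            }
        };
    }

    // Returns the cached counts, computing and storing them on a miss.
    synchronized Map<String, Integer> getOrCompute(String query, String facetField,
                                                   Supplier<Map<String, Integer>> compute) {
        Key key = new Key(query, facetField);
        Map<String, Integer> counts = cache.get(key);
        if (counts == null) {
            counts = compute.get();
            cache.put(key, counts);
        }
        return counts;
    }

    synchronized int size() {
        return cache.size();
    }
}
```

With such a cache, a repeated faceted query would pay for the intersections only once; whether that is a net win depends on how often the same (query, field) pair actually recurs.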


[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

 
... is not the case. This is why 
we need to browse the Java source... LinkedHashMap extends HashMap and there isn't 
any 'traversal':
{code}
public V get(Object key) {
    Entry<K,V> e = (Entry<K,V>)getEntry(key);
    if (e == null)
        return null;
    e.recordAccess(this);
    return e.value;
}
{code}
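
For readers who want to see what recordAccess changes in practice, the following small sketch compares a LinkedHashMap created with accessOrder=true against one created with accessOrder=false. AccessOrderDemo and keysAfterGet are illustrative names, not part of any attachment on this issue.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Shows the iteration order of a LinkedHashMap after a single get().
final class AccessOrderDemo {

    static List<String> keysAfterGet(boolean accessOrder) {
        Map<String, Integer> map = new LinkedHashMap<>(16, 0.75f, accessOrder);
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        map.get("a"); // with accessOrder=true this moves "a" to the end of the internal list
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        System.out.println(keysAfterGet(true));  // [b, c, a]
        System.out.println(keysAfterGet(false)); // [a, b, c]
    }
}
```

In the access-ordered map a plain get() relinks the entry, so it writes to shared state; in the insertion-ordered map it does not.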


bq. Consider the following case: thread A performs a synchronized put, thread B 
performs an unsynchronized get on the same key. B gets scheduled before A 
completes, the returned value will be undefined.
The returned value is well defined: it is either null or the correct value.

bq. That's exactly the case here - the update thread modifies the map 
structurally! 
Who cares? We are not iterating the map!

bq. That said, if you can show conclusively (e.g. with a profiler) that the 
synchronized access is indeed the bottleneck and incurs a heavy penalty on 
performance, then I'm all for investigating this further.

*What?!!*


Anyway, I believe a simplified ConcurrentLRU backed by ConcurrentHashMap is 
easier to understand and troubleshoot...

bq. I don't see the point of the static popularityCounter... that looks like a 
bug.
No, it is not a bug. It is effectively a "checkpoint", like a timer: one timer 
for all instances. We could use System.currentTimeMillis() instead, but a static 
volatile long is faster.
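
A rough sketch of the idea (a ConcurrentHashMap plus one shared counter used as a logical clock) might look like the code below. It is not one of the attached files: SimpleConcurrentLru and its members are hypothetical names, an AtomicLong is swapped in for the static volatile long so that increments are not lost, and eviction is only approximate under concurrent use.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Approximate LRU cache: every access stamps the entry with a shared logical clock,
// and an insert that exceeds maxSize evicts the entry with the smallest stamp.
final class SimpleConcurrentLru<K, V> {

    private static final class Entry<V> {
        final V value;
        volatile long lastAccess;

        Entry(V value, long stamp) {
            this.value = value;
            this.lastAccess = stamp;
        }
    }

    // One clock shared by all instances, in the spirit of the "checkpoint" counter.
    private static final AtomicLong CLOCK = new AtomicLong();

    private final ConcurrentHashMap<K, Entry<V>> map = new ConcurrentHashMap<>();
    private final int maxSize;

    SimpleConcurrentLru(int maxSize) {
        this.maxSize = maxSize;
    }

    V get(K key) {
        Entry<V> e = map.get(key);
        if (e == null) {
            return null;
        }
        e.lastAccess = CLOCK.incrementAndGet(); // no lock and no list reordering
        return e.value;
    }

    void put(K key, V value) {
        map.put(key, new Entry<>(value, CLOCK.incrementAndGet()));
        if (map.size() > maxSize) {
            evictOldest();
        }
    }

    // Linear scan for the smallest stamp; simple, but O(n) per eviction.
    private void evictOldest() {
        K oldestKey = null;
        long oldest = Long.MAX_VALUE;
        for (Map.Entry<K, Entry<V>> e : map.entrySet()) {
            long stamp = e.getValue().lastAccess;
            if (stamp < oldest) {
                oldest = stamp;
                oldestKey = e.getKey();
            }
        }
        if (oldestKey != null) {
            map.remove(oldestKey);
        }
    }

    int size() {
        return map.size();
    }
}
```

The design trades exact LRU ordering for lock-free reads; under heavy concurrent inserts the map can briefly exceed maxSize.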

About the specific use case: yes... if someone gets a 0.5-second response time for 
faceted queries, I am very happy... I had 15 seconds before going with FIFO. 

  
> FIFO Cache (Unsynchronized): 9x times performance boost
> ---
>
> Key: SOLR-665
> URL: https://issues.apache.org/jira/browse/SOLR-665
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 1.3
> Environment: JRockit R27 (Java 6)
>Reporter: Fuad Efendi
> Attachments: ConcurrentFIFOCache.java, ConcurrentFIFOCache.java, 
> ConcurrentLRUCache.java, ConcurrentLRUWeakCache.java, FIFOCache.java, 
> SimplestConcurrentLRUCache.java
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Attached is a modified version of LRUCache where 
> 1. map = new LinkedHashMap(initialSize, 0.75f, false) - so that 
> "reordering"/true (the performance bottleneck of LRU) is replaced with 
> "insertion-order"/false (so that it becomes FIFO)
> 2. Almost all (absolutely unnecessary) synchronized statements are commented out
> See the discussion at 
> http://www.nabble.com/LRUCache---synchronized%21--td16439831.html
> Performance metrics (taken from SOLR Admin):
> LRU:
> Requests: 7638
> Average Time-Per-Request: 15300
> Average Request-per-Second: 0.06
> FIFO:
> Requests: 3355
> Average Time-Per-Request: 1610
> Average Request-per-Second: 0.11
> Performance increased 9 times, which roughly corresponds to the number of CPUs in 
> the system, http://www.tokenizer.org/ (Shopping Search Engine at Tokenizer.org)
> Current number of documents: 7494689
> name:  filterCache  
> class:  org.apache.solr.search.LRUCache  
> version:  1.0  
> description:  LRU Cache(maxSize=1000, initialSize=1000)  
> stats:
> lookups : 15966954582
> hits : 16391851546
> hitratio : 0.102
> inserts : 4246120
> evictions : 0
> size : 2668705
> cumulative_lookups : 16415839763
> cumulative_hits : 16411608101
> cumulative_hitratio : 0.99
> cumulative_inserts : 4246246
> cumulative_evictions : 0 
> Thanks
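
The first point of the description (an insertion-ordered LinkedHashMap acting as a FIFO cache) can be sketched as follows. This is not the attached FIFOCache.java: FifoCache and maxSize are illustrative names, and the sketch leaves the synchronization question from point 2 aside, so it is not thread-safe as written.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// FIFO cache: entries are evicted in insertion order, regardless of how often they are read.
final class FifoCache<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;

    private final int maxSize;

    FifoCache(int initialSize, int maxSize) {
        super(initialSize, 0.75f, false); // false = insertion order, so get() never reorders
        this.maxSize = maxSize;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxSize; // drop the oldest inserted entry once the cap is exceeded
    }

    public static void main(String[] args) {
        FifoCache<String, Integer> cache = new FifoCache<>(16, 2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");    // reading "a" does not protect it from eviction
        cache.put("c", 3); // evicts "a", the oldest insertion
        System.out.println(cache.keySet()); // [b, c]
    }
}
```

Compared with LRU, a frequently used entry can be evicted simply because it was inserted early, so the hit ratio may be lower for some workloads.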


[jira] Updated: (SOLR-669) SOLR currently does not support caching for (Query, FacetFieldList)


 [ 
https://issues.apache.org/jira/browse/SOLR-669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fuad Efendi updated SOLR-669:
-

Remaining Estimate: 1680h  (was: 0.03h)
 Original Estimate: 1680h  (was: 0.03h)

> SOLR currently does not support caching for (Query, FacetFieldList)
> ---
>
> Key: SOLR-669
> URL: https://issues.apache.org/jira/browse/SOLR-669
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 1.3
>    Reporter: Fuad Efendi
>   Original Estimate: 1680h
>  Remaining Estimate: 1680h
>

[jira] Created: (SOLR-669) SOLR currently does not support caching for (Query, FacetFieldList)

SOLR currently does not support caching for (Query, FacetFieldList)
---

 Key: SOLR-669
 URL: https://issues.apache.org/jira/browse/SOLR-669
 Project: Solr
  Issue Type: Improvement
Affects Versions: 1.3
Reporter: Fuad Efendi


It is a huge performance bottleneck, and it explains the huge difference between 
QTime and SolrJ's elapsedTime. I quickly browsed SolrIndexSearcher: it caches 
only (Key, DocSet/DocList) key-value pairs and it does not have 
a cache for (Query, FacetFieldList).
filterCache stores a DocSet for each 'filter' and is used for constant 
recalculations...

Adding such a cache would be a significant performance improvement.


[jira] Updated: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost


 [ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fuad Efendi updated SOLR-665:
-

Attachment: SimplestConcurrentLRUCache.java

> FIFO Cache (Unsynchronized): 9x times performance boost
> ---
>
> Key: SOLR-665
> URL: https://issues.apache.org/jira/browse/SOLR-665
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 1.3
> Environment: JRockit R27 (Java 6)
>    Reporter: Fuad Efendi
> Attachments: ConcurrentFIFOCache.java, ConcurrentFIFOCache.java, 
> ConcurrentLRUCache.java, ConcurrentLRUWeakCache.java, FIFOCache.java, 
> SimplestConcurrentLRUCache.java
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>

[jira] Updated: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost


 [ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fuad Efendi updated SOLR-665:
-

Attachment: (was: SimplestLRUCache.java)

> FIFO Cache (Unsynchronized): 9x times performance boost
> ---
>
> Key: SOLR-665
> URL: https://issues.apache.org/jira/browse/SOLR-665
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 1.3
> Environment: JRockit R27 (Java 6)
>    Reporter: Fuad Efendi
> Attachments: ConcurrentFIFOCache.java, ConcurrentFIFOCache.java, 
> ConcurrentLRUCache.java, ConcurrentLRUWeakCache.java, FIFOCache.java, 
> SimplestConcurrentLRUCache.java
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>

[jira] Updated: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost


 [ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fuad Efendi updated SOLR-665:
-

Attachment: (was: ConcurrentLRUWeakCache.java)

> FIFO Cache (Unsynchronized): 9x times performance boost
> ---
>
> Key: SOLR-665
> URL: https://issues.apache.org/jira/browse/SOLR-665
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 1.3
> Environment: JRockit R27 (Java 6)
>    Reporter: Fuad Efendi
> Attachments: ConcurrentFIFOCache.java, ConcurrentFIFOCache.java, 
> ConcurrentLRUCache.java, ConcurrentLRUWeakCache.java, FIFOCache.java, 
> SimplestLRUCache.java
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>

[jira] Updated: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost


 [ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fuad Efendi updated SOLR-665:
-

Attachment: (was: ConcurrentLRUWeakCache.java)

> FIFO Cache (Unsynchronized): 9x times performance boost
> ---
>
> Key: SOLR-665
> URL: https://issues.apache.org/jira/browse/SOLR-665
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 1.3
> Environment: JRockit R27 (Java 6)
>    Reporter: Fuad Efendi
> Attachments: ConcurrentFIFOCache.java, ConcurrentFIFOCache.java, 
> ConcurrentLRUCache.java, ConcurrentLRUWeakCache.java, FIFOCache.java, 
> SimplestLRUCache.java
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>

[jira] Updated: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost


 [ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fuad Efendi updated SOLR-665:
-

Attachment: SimplestLRUCache.java

> FIFO Cache (Unsynchronized): 9x times performance boost
> ---
>
> Key: SOLR-665
> URL: https://issues.apache.org/jira/browse/SOLR-665
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 1.3
> Environment: JRockit R27 (Java 6)
>    Reporter: Fuad Efendi
> Attachments: ConcurrentFIFOCache.java, ConcurrentFIFOCache.java, 
> ConcurrentLRUCache.java, ConcurrentLRUWeakCache.java, 
> ConcurrentLRUWeakCache.java, ConcurrentLRUWeakCache.java, FIFOCache.java, 
> SimplestLRUCache.java
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>

[jira] Updated: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost


 [ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fuad Efendi updated SOLR-665:
-

Attachment: (was: SimplestLRUCache.java)

> FIFO Cache (Unsynchronized): 9x times performance boost
> ---
>
> Key: SOLR-665
> URL: https://issues.apache.org/jira/browse/SOLR-665
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 1.3
> Environment: JRockit R27 (Java 6)
>    Reporter: Fuad Efendi
> Attachments: ConcurrentFIFOCache.java, ConcurrentFIFOCache.java, 
> ConcurrentLRUCache.java, ConcurrentLRUWeakCache.java, 
> ConcurrentLRUWeakCache.java, ConcurrentLRUWeakCache.java, FIFOCache.java, 
> SimplestLRUCache.java
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>

[jira] Updated: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost


 [ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fuad Efendi updated SOLR-665:
-

Attachment: SimplestLRUCache.java

> FIFO Cache (Unsynchronized): 9x times performance boost
> ---
>
> Key: SOLR-665
> URL: https://issues.apache.org/jira/browse/SOLR-665
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 1.3
> Environment: JRockit R27 (Java 6)
>    Reporter: Fuad Efendi
> Attachments: ConcurrentFIFOCache.java, ConcurrentFIFOCache.java, 
> ConcurrentLRUCache.java, ConcurrentLRUWeakCache.java, 
> ConcurrentLRUWeakCache.java, ConcurrentLRUWeakCache.java, FIFOCache.java, 
> SimplestLRUCache.java
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>

[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost


[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12618339#action_12618339
 ] 

funtick edited comment on SOLR-665 at 7/30/08 7:08 AM:
---

Noble, thanks for the feedback!

Of course my code is buggy, but I only wanted _to illustrate_ the simplest idea; I 
am extremely busy with other stuff (Liferay) and can't focus on SOLR 
improvements... maybe during the weekend.

bq. ...will always evaluate to false. And the reference will always have one 
value
- yes, this is a bug. There are other bugs too...

bq. We must be removing the entry which was accessed first (not last).. 
I meant the same (and so does the code); probably wrong wording.

bq. And the static volatile counter is not threadsafe. 
Do we _really-really_ need thread safety here? By using 'volatile' I only 
prevent _some_ JVMs from trying to optimize some code (and cause problems).
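
As a side note on the thread-safety question, the sketch below contrasts a volatile long incremented with ++ against an AtomicLong. volatile guarantees visibility, but counter++ is still a separate read, add and write, so concurrent increments can be lost. CounterDemo and its fields are illustrative names, not code from the attachments.

```java
import java.util.concurrent.atomic.AtomicLong;

// Compares a volatile counter (visible, but not atomic) with an AtomicLong.
final class CounterDemo {
    static volatile long volatileCounter = 0;
    static final AtomicLong atomicCounter = new AtomicLong();

    static void run(int threads, final int perThread) {
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    volatileCounter++;               // read-modify-write: updates may be lost
                    atomicCounter.incrementAndGet(); // atomic: no update is lost
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
    }

    public static void main(String[] args) {
        run(4, 100_000);
        System.out.println("atomic:   " + atomicCounter.get()); // always 400000
        System.out.println("volatile: " + volatileCounter);     // may be less than 400000
    }
}
```

For a counter that only serves as an approximate "timer", a few lost increments may well be acceptable; the sketch only shows what is being traded away.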

bq. There is no need to use a WeakReference anywhere
Agree... 

bq. To get that you must maintain a linkedlist the way linkedhashmap maintains. 
No other shortcut. 
Maybe... but it looks similar to Arrays.sort(), or TreeSet, etc., and I am 
trying to avoid this. 'No other shortcut' - maybe, but I am unsure.

Thanks!



  was (Author: funtick):
Noble, thanks for the feedback!

Of course my code is buggy, but I only wanted _to illustrate_ the simplest idea; I 
am extremely busy with other stuff (Liferay) and can't focus on SOLR 
improvements... maybe during the weekend.

bq. ...will always evaluate to false. And the reference will always have one 
value
- yes, this is a bug. There are other bugs too...

bq. We must be removing the entry which was accessed first (not last).. 
I meant the same (and so does the code); probably wrong wording.

bq. And the static volatile counter is not threadsafe. 
Do we _really-really_ need thread safety here? By using 'volatile' I only 
prevent _some_ JVMs from trying to optimize some code (and cause problems with 
per-instance variables which never change).

bq. There is no need to use a WeakReference anywhere
Agree... 

bq. To get that you must maintain a linkedlist the way linkedhashmap maintains. 
No other shortcut. 
Maybe... but it looks similar to Arrays.sort(), or TreeSet, etc., and I am 
trying to avoid this. 'No other shortcut' - maybe, but I am unsure.

Thanks!


  
> FIFO Cache (Unsynchronized): 9x times performance boost
> ---
>
> Key: SOLR-665
> URL: https://issues.apache.org/jira/browse/SOLR-665
> Project: Solr
> Issue Type: Improvement
> Affects Versions: 1.3
> Environment: JRockit R27 (Java 6)
> Reporter: Fuad Efendi
> Attachments: ConcurrentFIFOCache.java, ConcurrentFIFOCache.java,
> ConcurrentLRUCache.java, ConcurrentLRUWeakCache.java,
> ConcurrentLRUWeakCache.java, ConcurrentLRUWeakCache.java, FIFOCache.java
>
> Original Estimate: 672h
> Remaining Estimate: 672h
>
> Attached is a modified version of LRUCache where
> 1. map = new LinkedHashMap(initialSize, 0.75f, false) - so that
> "reordering"/true (the performance bottleneck of LRU) is replaced with
> "insertion-order"/false (so that it becomes FIFO)
> 2. Almost all (absolutely unnecessary) synchronized statements are commented out
> See discussion at
> http://www.nabble.com/LRUCache---synchronized%21--td16439831.html
> Performance metrics (taken from SOLR Admin):
> LRU:
> Requests: 7638
> Average Time-Per-Request: 15300
> Average Request-per-Second: 0.06
> FIFO:
> Requests: 3355
> Average Time-Per-Request: 1610
> Average Request-per-Second: 0.11
> Performance increased 9 times, which roughly corresponds to the number of CPUs in
> the system, http://www.tokenizer.org/ (Shopping Search Engine at Tokenizer.org)
> Current number of documents: 7494689
> name: filterCache
> class: org.apache.solr.search.LRUCache
> version: 1.0
> description: LRU Cache(maxSize=1000, initialSize=1000)
> stats: lookups : 15966954582
> hits : 16391851546
> hitratio : 0.102
> inserts : 4246120
> evictions : 0
> size : 2668705
> cumulative_lookups : 16415839763
> cumulative_hits : 16411608101
> cumulative_hitratio : 0.99
> cumulative_inserts : 4246246
> cumulative_evictions : 0 
> Thanks
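
The third constructor argument of LinkedHashMap mentioned in the description selects the iteration (and therefore eviction) order: true means access order (LRU), false means insertion order (FIFO). The sketch below is an illustration only, not the attached FIFOCache.java; the class EvictionOrderDemo and its helper methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class EvictionOrderDemo {
    /** Bounded map; accessOrder=true gives LRU eviction, false gives FIFO. */
    static <K, V> Map<K, V> boundedMap(final int maxSize, boolean accessOrder) {
        return new LinkedHashMap<K, V>(16, 0.75f, accessOrder) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxSize;
            }
        };
    }

    /** Inserts a and b, reads a, inserts c into a map of capacity 2; returns the surviving keys. */
    static List<String> survivors(boolean accessOrder) {
        Map<String, Integer> map = boundedMap(2, accessOrder);
        map.put("a", 1);
        map.put("b", 2);
        map.get("a");    // in access-order mode this relinks "a" to the tail of the list
        map.put("c", 3); // triggers eviction of the eldest entry
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        System.out.println("LRU  (accessOrder=true):  " + survivors(true));  // [a, c]
        System.out.println("FIFO (accessOrder=false): " + survivors(false)); // [b, c]
    }
}
```

In access-order mode even get() modifies the internal linked list, which is why an LRU built this way needs synchronization on reads as well as writes; in insertion-order mode get() does not reorder anything, although the LinkedHashMap javadoc still asks for external synchronization whenever another thread may modify the map structurally.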

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost


[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12618339#action_12618339
 ] 

funtick edited comment on SOLR-665 at 7/30/08 7:06 AM:
---

Noble, thanks for the feedback!

Of course my code is buggy, but I only wanted _to illustrate_ the simplest idea; I
am extremely busy with other stuff (Liferay) and can't focus on SOLR
improvements... maybe during the weekend.

bq. ...will always evaluate to false. And the reference will always have one
value
- yes, this is a bug. There are other bugs too...

bq. We must be removing the entry which was accessed first (not last)..
I mean the same (and so does the code); probably wrong wording on my part.

bq. And the static volatile counter is not threadsafe.
Do we _really-really_ need thread safety here? By using 'volatile' I only
prevent _some_ JVMs from trying to optimize some code (and cause problems with
per-instance variables which never change).

bq. There is no need to use a WeakReference anywhere
Agree...

bq. To get that you must maintain a linkedlist the way linkedhashmap maintains.
No other shortcut.
Maybe... but it looks similar to Arrays.sort(), or TreeSet, etc., and I am
trying to avoid this. 'No other shortcut' - maybe, but I am unsure.

Thanks!





  

[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost


[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12618339#action_12618339
 ] 

Fuad Efendi commented on SOLR-665:
--

Noble, thanks for the feedback!

Of course my code is buggy, but I only wanted _to illustrate_ the simplest idea; I
am extremely busy with other stuff (Liferay) and can't focus on SOLR
improvements... maybe during the weekend.

bq. ...will always evaluate to false. And the reference will always have one
value
- yes, this is a bug. There are other bugs too...

bq. We must be removing the entry which was accessed first (not last)..
I mean the same (and so does the code); probably wrong wording on my part.

bq. And the static volatile counter is not threadsafe.
Do we _really-really_ need thread safety here? By using 'volatile' I only
prevent _some_ JVMs from trying to optimize some code (and cause problems with
per-instance variables which never change).

bq. There is no need to use a WeakReference anywhere
Agree...

bq. To get that you must maintain a linkedlist the way linkedhashmap maintains.
No other shortcut.
Maybe... but it looks similar to Arrays.sort(), or TreeSet, etc., and I am
trying to avoid this. 'No other shortcut' - maybe, but I am unsure.

Thanks!




[jira] Updated: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-29 Thread Fuad Efendi (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fuad Efendi updated SOLR-665:
-

Attachment: ConcurrentLRUWeakCache.java

another bug... and AtomicReference is generic... never used it before. Could be
even 'weaker' if we use 'hashcode', which is an int (and atomic), instead of Key,
which is an object (and unsafe), and the distribution of hashcode is ok...


[jira] Updated: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-29 Thread Fuad Efendi (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fuad Efendi updated SOLR-665:
-

Attachment: ConcurrentLRUWeakCache.java

bug fix


[jira] Updated: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-29 Thread Fuad Efendi (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fuad Efendi updated SOLR-665:
-

Attachment: ConcurrentLRUWeakCache.java

Thanks Paul, I am always trying to simplify... AtomicReference is a pointer to 
(approximately) least recently used 'key'...


[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost


[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617699#action_12617699
 ] 

Fuad Efendi commented on SOLR-665:
--

Paul, I want to do it... understand me, since February I have been having constant
performance problems with faceted queries (15-20 seconds response time). I
ordered a new server with 32 GB & 2x quad-core (8x more power!) but it
didn't improve performance; finally I commented out the synchronization in
LRUCache and made it FIFO... I was very impatient with this post, just tried to
share some very real stuff...


[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost


[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617692#action_12617692
 ] 

Fuad Efendi commented on SOLR-665:
--

BTW there is almost no functional difference between LRU and FIFO. And
there is a *huge difference* between LRU (Least Recently Used) and LFU (Least
Frequently Used).
It's easy to implement ConcurrentLFU based on the provided ConcurrentLRU template;
of course, following the main _contract_ of org.apache.solr.search.SolrCache.
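
To make the LRU/LFU distinction concrete, here is a minimal sketch of LFU eviction. It is an assumption-laden illustration: SimpleLfuCache is a hypothetical class, it does not implement org.apache.solr.search.SolrCache, and it uses plain synchronization and an O(n) scan rather than anything from the attached ConcurrentLRU code.

```java
import java.util.HashMap;
import java.util.Map;

public class SimpleLfuCache<K, V> {
    /** A cached value together with the number of times it has been read. */
    private static final class Slot<V> {
        final V value;
        long hits;
        Slot(V value) { this.value = value; }
    }

    private final int maxSize;
    private final Map<K, Slot<V>> map = new HashMap<>();

    public SimpleLfuCache(int maxSize) { this.maxSize = maxSize; }

    public synchronized V get(K key) {
        Slot<V> slot = map.get(key);
        if (slot == null) return null;
        slot.hits++; // frequency, not recency, decides eviction
        return slot.value;
    }

    public synchronized void put(K key, V value) {
        if (!map.containsKey(key) && map.size() >= maxSize) {
            evictLeastFrequentlyUsed();
        }
        map.put(key, new Slot<>(value)); // replacing a key resets its hit count
    }

    public synchronized boolean containsKey(K key) { return map.containsKey(key); }

    // O(n) scan for the entry with the fewest hits; acceptable for a sketch only
    private void evictLeastFrequentlyUsed() {
        K victim = null;
        long fewest = Long.MAX_VALUE;
        for (Map.Entry<K, Slot<V>> e : map.entrySet()) {
            if (e.getValue().hits < fewest) {
                fewest = e.getValue().hits;
                victim = e.getKey();
            }
        }
        map.remove(victim);
    }

    public static void main(String[] args) {
        SimpleLfuCache<String, Integer> cache = new SimpleLfuCache<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");
        cache.get("a");    // "a" is the frequently used entry
        cache.get("b");    // "b" is the most recently used entry, but read less often
        cache.put("c", 3); // LFU evicts "b"; an LRU would have evicted "a"
        System.out.println(cache.containsKey("a") + " " + cache.containsKey("b")
                + " " + cache.containsKey("c"));
    }
}
```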


[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost


[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617684#action_12617684
 ] 

funtick edited comment on SOLR-665 at 7/28/08 9:20 PM:
---

bq. Fuad is after fastest-possible reads, everybody is after reasonable 
behavior in the face of concurrent writes

Thanks and sorry for runtime errors;

FIFO looks strange at first, but... for large cache (10 items), most 
popular item can be _mistakenly_ removed... but I don't think there are any 
'most popular facets' etc.; it's evenly distributed in most cases.

Another issue: SOLR always tries to _recalculate_ _facets_ even with an extremely
large filterCache & queryResultCache; even the same faceted query always shows
the same long response times.

bq. It is if nothing is modifying the map during the get. If something is 
modifying the map you don't know how the implementation handles the insert of a 
new value. It might copy the object, and you'd end up with half an object or 
even an invalid memory location. That's why the javadoc says that you must 
synchronize accesses if anything modifies the map - this is not limited to 
iterators.

I agree of course... However, we are not dealing with an unknown implementation of
java.util.Map that is cloning (java.lang.Cloneable) objects somehow or using some
weird object introspection etc.


  

[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost


[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617684#action_12617684
 ] 

Fuad Efendi commented on SOLR-665:
--

bq. Fuad is after fastest-possible reads, everybody is after reasonable 
behavior in the face of concurrent writes

Thanks and sorry for runtime errors;

FIFO looks strange at first, but... for large cache (10 items), most 
popular item can be _mistakenly_ removed... but I don't think there are any 
'most popular facets' etc.; it's evenly distributed in most cases.

Another issue: SOLR always tries to _recalculate_ _facets_ even with an extremely
large filterCache & queryResultCache; even the same faceted query always shows
the same long response times.



[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

p and there is no 
any 'traversal',
{code}
public V get(Object key) {
    Entry<K,V> e = (Entry<K,V>)getEntry(key);
    if (e == null)
        return null;
    // in access-order mode this relinks the entry to the tail of the list
    e.recordAccess(this);
    return e.value;
}
{code}


bq. Consider the following case: thread A performs a synchronized put, thread B 
performs an unsynchronized get on the same key. B gets scheduled before A 
completes, the returned value will be undefined.
the returned value is well defined: it is either null or correct value.

bq. That's exactly the case here - the update thread modifies the map 
structurally! 
Who cares? We are not iterating the map!

Anyway, I believe simplified ConcurrentLRU backed by ConcurrentHashMap is 
easier to understand and troubleshoot...

bq. I don't see the point of the static popularityCounter... that looks like a 
bug.
No, it is not a bug. It is effectively a "checkpoint", like a timer: one timer
for all instances. We can use System.currentTimeMillis() instead, but a static
volatile long is faster.

About specific use case: yes... if someone has 0.5 seconds response time for 
faceted queries I am very happy... I had 15 seconds before going with FIFO. 
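
The idea discussed above, a simplified LRU backed by ConcurrentHashMap with one shared counter acting as a logical clock, can be sketched roughly as follows. This is not the attached ConcurrentLRUCache.java: the class ApproxLruCache is hypothetical, and it uses an AtomicLong where the comment describes a static volatile long, so it trades a little speed for exact ticks.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

public class ApproxLruCache<K, V> {
    // One logical clock shared by all instances, in the spirit of popularityCounter
    private static final AtomicLong CLOCK = new AtomicLong();

    /** A cached value together with the clock tick of its last access. */
    private static final class Slot<V> {
        final V value;
        volatile long lastAccess;
        Slot(V value, long now) { this.value = value; this.lastAccess = now; }
    }

    private final int maxSize;
    private final ConcurrentHashMap<K, Slot<V>> map = new ConcurrentHashMap<>();

    public ApproxLruCache(int maxSize) { this.maxSize = maxSize; }

    public V get(K key) {
        Slot<V> slot = map.get(key); // no lock is taken on the read path
        if (slot == null) return null;
        slot.lastAccess = CLOCK.incrementAndGet();
        return slot.value;
    }

    public void put(K key, V value) {
        map.put(key, new Slot<>(value, CLOCK.incrementAndGet()));
        if (map.size() > maxSize) {
            evictOldest(); // under concurrency the size may overshoot briefly
        }
    }

    public boolean containsKey(K key) { return map.containsKey(key); }

    public int size() { return map.size(); }

    // O(n) scan for the smallest lastAccess stamp; approximate under concurrent access
    private void evictOldest() {
        K victim = null;
        long oldest = Long.MAX_VALUE;
        for (Map.Entry<K, Slot<V>> e : map.entrySet()) {
            long seen = e.getValue().lastAccess;
            if (seen < oldest) {
                oldest = seen;
                victim = e.getKey();
            }
        }
        if (victim != null) map.remove(victim);
    }

    public static void main(String[] args) {
        ApproxLruCache<String, Integer> cache = new ApproxLruCache<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");    // refreshes the stamp of "a"
        cache.put("c", 3); // evicts "b", the entry with the oldest stamp
        System.out.println(cache.containsKey("a") + " " + cache.containsKey("b")
                + " " + cache.containsKey("c"));
    }
}
```

The design choice here is that reads never reorder a shared linked list; they only write a timestamp into the entry they found, and the cost of finding a victim is moved to the (rarer) insert path.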

  


[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

[
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617654#action_12617654
]

funtick edited comment on SOLR-665 at 7/28/08 8:41 PM:
---

bq. The Solr admin pages will not give you exact measurements.
Yes, and I do not need exact measurements! It gives me averageTimePerRequest,
which improved almost 10 times on the production server. Should I write JUnit tests
and execute them in a single-threaded environment? It would be better to use The
Grinder, but I don't have the time or spare CPUs.

bq. I've seen throughputs in excess of 400 searches per second.
But 'searches per second' is not the same as 'average response time'!!!

bq. Are you using highlighting or anything else that might be CPU-intensive at
all?
Yes, I am using highlighting. You can see it at http://www.tokenizer.org

bq. I'm guessing that you're caching the results of all queries in memory such
that no disk access is necessary.
{color:red} But this is another bug of SOLR!!! I am using extremely large
caches but SOLR still *recalculates* facet intersections. {color}

bq. A FIFO cache might become a bottleneck itself - if the cache is very large
and the most frequently accessed item is inserted just after the cache is
created, all accesses will need to traverse all the other entries before
getting that item.

- sorry, I didn't understand... yes, if cache contains 10 entries and 'most
popular item' is removed... Why 'traverse all the other entries before getting
that item'? why 9 items are less popular (cumulative) than single one
(absolute)?

bq. Consider the following case: thread A performs a synchronized put, thread B
performs an unsynchronized get on the same key. B gets scheduled before A
completes, the returned value will be undefined.
The returned value is well defined: it is either null or the correct value.

bq. That's exactly the case here - the update thread modifies the map
structurally!
Who cares? We are not iterating the map!

Anyway, I believe simplified ConcurrentLRU backed by ConcurrentHashMap is
easier to understand and troubleshoot...
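As a minimal sketch of the ConcurrentHashMap route mentioned above (not the attached FIFOCache or ConcurrentLRU code): one writer and one reader share a ConcurrentHashMap, and the reader checks that every get() returns either null or the complete value. The class name ConcurrentGetDemo and the method countUnexpectedReads are made up for this illustration. ConcurrentHashMap documents this behaviour for unsynchronized reads; a plain HashMap or LinkedHashMap makes no such promise.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class, not part of Solr.
class ConcurrentGetDemo {

    // Runs one writer and one reader against the same map and counts reads
    // that returned something other than null or the expected value.
    static int countUnexpectedReads(final int keys) {
        final ConcurrentHashMap<Integer, String> map =
                new ConcurrentHashMap<Integer, String>();
        final AtomicInteger unexpected = new AtomicInteger();

        Thread writer = new Thread(new Runnable() {
            public void run() {
                for (int i = 0; i < keys; i++) {
                    map.put(i, "value-" + i);
                }
            }
        });
        Thread reader = new Thread(new Runnable() {
            public void run() {
                for (int i = 0; i < keys; i++) {
                    String v = map.get(i); // no external lock around get()
                    if (v != null && !v.equals("value-" + i)) {
                        unexpected.incrementAndGet();
                    }
                }
            }
        });

        writer.start();
        reader.start();
        try {
            writer.join();
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return unexpected.get();
    }

    public static void main(String[] args) {
        System.out.println("unexpected reads: " + countUnexpectedReads(100000));
    }
}
```

A run like this cannot prove that an unsynchronized map is safe; it only illustrates the contract that ConcurrentHashMap provides by design.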

bq. I don't see the point of the static popularityCounter... that looks like a
bug.
No, it is not a bug. It is effectively a "checkpoint", like a timer: one timer
for all instances. We could use System.currentTimeMillis() instead, but a static
volatile long is faster.
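One caveat on a shared counter: ++ on a volatile long is a read followed by a write, so two threads can read the same value and one increment is lost. For a popularity stamp that may be acceptable, since it only blurs the ordering. If exact ticks are wanted, a sketch with AtomicLong could look like the following; the names AccessClock, next and hammer are assumptions for this example and do not come from the attachment.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical shared logical clock: one counter for all cache instances.
class AccessClock {

    private static final AtomicLong TICKS = new AtomicLong();

    // Returns a new, strictly increasing stamp; safe to call from any thread.
    static long next() {
        return TICKS.incrementAndGet();
    }

    // Calls next() from several threads and returns how many ticks were issued.
    static long hammer(int threads, final int perThread) {
        long before = TICKS.get();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(new Runnable() {
                public void run() {
                    for (int i = 0; i < perThread; i++) {
                        next();
                    }
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return TICKS.get() - before;
    }

    public static void main(String[] args) {
        System.out.println("ticks issued: " + hammer(4, 100000));
    }
}
```

With AtomicLong no increment is lost, at the cost of an atomic compare-and-swap per access.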

About specific use case: yes... if someone has 0.5 seconds response time for
faceted queries I am very happy... I had 15 seconds before going with FIFO.


[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost


[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617654#action_12617654
 ] 

funtick edited comment on SOLR-665 at 7/28/08 8:36 PM:
---

bq. The Solr admin pages will not give you exact measurements. 
Yes, and I do not need exact measurements! It gives me averageTimePerRequest, 
which improved almost 10 times on the production server. Should I write JUnit tests 
and execute them in a single-threaded environment? It would be better to use The 
Grinder, but I don't have the time or spare CPUs.

bq. I've seen throughputs in excess of 400 searches per second. 
But 'searches per second' is not the same as 'average response time'!!!

bq. Are you using highlighting or anything else that might be CPU-intensive at 
all? 
Yes, I am using highlighting. You can see it at http://www.tokenizer.org


bq. I'm guessing that you're caching the results of all queries in memory such 
that no disk access is necessary.
{color:red} But this is another bug of SOLR!!! I am using extremely large 
caches but SOLR still *recalculates* facet intersections. {color}


bq. Consider the following case: thread A performs a synchronized put, thread B 
performs an unsynchronized get on the same key. B gets scheduled before A 
completes, the returned value will be undefined.
The returned value is well defined: it is either null or the correct value.

bq. That's exactly the case here - the update thread modifies the map 
structurally! 
Who cares? We are not iterating the map!

Anyway, I believe simplified ConcurrentLRU backed by ConcurrentHashMap is 
easier to understand and troubleshoot...

bq. I don't see the point of the static popularityCounter... that looks like a 
bug.
No, it is not a bug. It is effectively a "checkpoint", like a timer: one timer 
for all instances. We could use System.currentTimeMillis() instead, but a static 
volatile long is faster.

About specific use case: yes... if someone has 0.5 seconds response time for 
faceted queries I am very happy... I had 15 seconds before going with FIFO. 



[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost


[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617673#action_12617673
 ] 

Fuad Efendi commented on SOLR-665:
--

bq. The "it" references the map. It says explicitly that the map must be 
synchronized.

- I agree, thanks for pointing to it. Synchronize!

BTW, Joshua Bloch developed Arrays.sort(), and bug was found after 9 years. 
Nothing is perfect.

ConcurrentLRU looks extremely simple and easy to improve. Should we check SUN's 
bug database before using ConcurrentHashMap? It has some related...


[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost


[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617668#action_12617668
 ] 

Fuad Efendi commented on SOLR-665:
--

bq. No, it is not! Your analysis seems to ignore the java memory model 
(partially constructed objects and all that). I don't know how many different 
ways to say it please do yourself a favor and read up on the java memory 
model (and the book I previously referenced is great for this). This is hard 
stuff (at the lowest levels).

OK. Maybe I can get a reference to the wrong object type, or even to an object 
scheduled for finalization... But we are not inserting 'partially constructed 
objects' into the Map, are we?

Simplest scenario: thread A tries to read a variable (4 bytes of address in the 
JVM) pointing to object O. Another thread B concurrently assigns _null_ to that 
variable. Isn't that already solved at the CPU level? Or maybe, on a 64-bit 
system, thread B assigns zero to the first 2 bytes and then to the other 2 bytes?

I need to study this book... BTW, I am running JVM with '-server' option.
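On the question above: the Java Language Specification (section 17.7) states that reads and writes of references are always atomic, on 32-bit and 64-bit JVMs alike, so a torn pointer is not the risk. The risk is visibility and ordering: without a happens-before edge, a reader may see the reference before it sees the fields written in the constructor. A minimal sketch of safe publication through a volatile field follows; Holder and PublicationDemo are hypothetical names used only for this illustration.

```java
// Hypothetical value class with non-final fields.
class Holder {
    private int a;
    private int b;

    Holder(int a, int b) {
        this.a = a;
        this.b = b;
    }

    int sum() {
        return a + b;
    }
}

class PublicationDemo {

    // volatile: the write of the reference happens-before any read that sees it,
    // so the constructor's field writes are visible to the reader as well.
    private static volatile Holder shared;

    static int readAfterPublish() {
        shared = null;
        Thread writer = new Thread(new Runnable() {
            public void run() {
                shared = new Holder(1, 2);
            }
        });
        writer.start();

        Holder h;
        while ((h = shared) == null) {
            Thread.yield(); // spin until the writer publishes
        }
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return h.sum();
    }

    public static void main(String[] args) {
        System.out.println("sum seen by reader: " + readAfterPublish());
    }
}
```

Without volatile (or a lock, or final fields in Holder), the memory model would allow the reader to see default field values, or to never see the new reference at all.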


[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost


[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617664#action_12617664
 ] 

Fuad Efendi commented on SOLR-665:
--

bq. It is if nothing is modifying the map during the get. If something is 
modifying the map you don't know how the implementation handles the insert of a 
new value. It might copy the object, and you'd end up with half an object or 
even an invalid memory location. That's why the javadoc says that you must 
synchronize accesses if anything modifies the map - this is not limited to 
iterators.

JavaDoc does not say that. Instead, it says (I am repeating):
bq. ..and at least one of the threads modifies the map structurally, it must be 
synchronized externally. 

- Only the thread doing the structural modification must be synchronized. In the 
case of LinkedHashMap, for instance, we need to synchronize inserts in order to 
avoid Entry instances referencing themselves (orphans).


bq. you don't know how the implementation handles the insert of a new value

I know exactly: SOLR does not modify 'value' during 'insert', Map.Entry 
instances are immutable in SOLR, etc. Table resize is the main problem, but after 
analyzing the source code I don't see any problem. The concern that 'wrong value 
will be returned for a key' is not applicable, and the JavaDoc does not say 
anything about that. Collections internally use Map.Entry in an immutable way 
and do not change it.
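For reference, the HashMap JavaDoc that this exchange is about recommends wrapping the map with Collections.synchronizedMap when no other object naturally encapsulates it. A minimal sketch of a bounded FIFO cache built that way is below. It is an illustration under stated assumptions, not the attached FIFOCache.java; BoundedFifoCache and create are names invented for the example.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Solr.
class BoundedFifoCache {

    // Builds an insertion-ordered map that drops its oldest entry once it
    // holds more than maxEntries, with every access synchronized.
    static <K, V> Map<K, V> create(final int maxEntries) {
        LinkedHashMap<K, V> fifo = new LinkedHashMap<K, V>(16, 0.75f, false) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;
            }
        };
        return Collections.synchronizedMap(fifo);
    }

    public static void main(String[] args) {
        Map<String, Integer> cache = create(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");    // a read does not protect "a" in FIFO mode
        cache.put("c", 3); // evicts "a", the first key inserted
        System.out.println(cache.keySet());
    }
}
```

This keeps the cheap insertion-order bookkeeping of FIFO while staying inside the documented contract; the price is one lock for all readers and writers, which is exactly the contention the issue set out to remove.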


[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost


[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617657#action_12617657
 ] 

funtick edited comment on SOLR-665 at 7/28/08 7:36 PM:
---

Lars, I used FIFO because it is extremely simple to get unsynchronized _get()_:
{code}
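// accessOrder = true: LRU cache; get() relinks the entry, so we need synchronized get()
map = new LinkedHashMap(initialSize, 0.75f, true);
// accessOrder = false: FIFO; get() does not relink entries
// (so, in my view, we do not need synchronized get())
map = new LinkedHashMap(initialSize, 0.75f, false);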
{code}
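The difference between the two constructor flags can be seen in a small, self-contained sketch. AccessOrderDemo and keysAfterGet are illustrative names, not Solr code. The LinkedHashMap JavaDoc notes that in an access-ordered map merely calling get() is a structural modification, which is why the LRU variant cannot have an unsynchronized get().

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class, not part of Solr.
class AccessOrderDemo {

    // Returns the key order of a three-entry map after reading key "a".
    static List<String> keysAfterGet(boolean accessOrder) {
        Map<String, Integer> map =
                new LinkedHashMap<String, Integer>(16, 0.75f, accessOrder);
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        map.get("a"); // with accessOrder = true this relinks "a" to the tail
        return new ArrayList<String>(map.keySet());
    }

    public static void main(String[] args) {
        System.out.println("access order (LRU):     " + keysAfterGet(true));  // [b, c, a]
        System.out.println("insertion order (FIFO): " + keysAfterGet(false)); // [a, b, c]
    }
}
```

The sketch only shows that an insertion-ordered get() leaves the linked list untouched. Whether an unsynchronized get() is then safe while another thread inserts is a separate question, and the HashMap JavaDoc still asks for external synchronization in that case.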

Yonik, I'll try to improve ConcurrentLRU and to share findings... of course 
FIFO is not what we need.

bq. No it doesn't... think linked-list. It moves a single item, which is pretty 
fast.
yes, so I wrote 'evenly distributed between several get() so we can't see it' - 
it keeps List ordered and we can't unsynchronize it with all subsequences!!!


[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost


[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617657#action_12617657
 ] 

Fuad Efendi commented on SOLR-665:
--

Lars, I used FIFO because it is extremely simple to get unsynchronized _get()_:
{code}
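// accessOrder = true: LRU cache; get() relinks the entry, so we need synchronized get()
map = new LinkedHashMap(initialSize, 0.75f, true);
// accessOrder = false: FIFO; get() does not relink entries
// (so, in my view, we do not need synchronized get())
map = new LinkedHashMap(initialSize, 0.75f, false);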
{code}

Yonik, I'll try to improve ConcurrentLRU and to share findings... of course 
FIFO is not what we need.


[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost


[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617655#action_12617655
 ] 

Fuad Efendi commented on SOLR-665:
--

bq. The eviction code looks like it would be relatively expensive

but get() method of LinkedHashMap reorders whole map!!! (Of course, CPU load is 
evenly distributed between several get() so that we can't see it) Other 
implementations even use Arrays.sort() or something similar. I don't see easier 
solution than that... probably some random-access policy with predictable range 
of "popularity", we can evict anything 'old' and not necessarily 'eldest'...



[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost


[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617654#action_12617654
 ] 

Fuad Efendi commented on SOLR-665:
--

bq. The Solr admin pages will not give you exact measurements. 
Yes, and I do not need exact measurements! It gives me averageTimePerRequest, 
which improved almost 10 times on the production server. Should I write JUnit tests 
and execute them in a single-threaded environment? It would be better to use The 
Grinder, but I don't have the time or spare CPUs.

bq. Consider the following case: thread A performs a synchronized put, thread B 
performs an unsynchronized get on the same key. B gets scheduled before A 
completes, the returned value will be undefined.
The returned value is well defined: it is either null or the correct value.

bq. That's exactly the case here - the update thread modifies the map 
structurally! 
Who cares? We are not iterating the map!

Anyway, I believe simplified ConcurrentLRU backed by ConcurrentHashMap is 
easier to understand and troubleshoot...

bq. I don't see the point of the static popularityCounter... that looks like a 
bug.
No, it is not a bug. It is effectively a "checkpoint", like a timer: one timer 
for all instances. We could use System.currentTimeMillis() instead, but a static 
volatile long is faster.

About specific use case: yes... if someone has 0.5 seconds response time for 
faceted queries I am very happy... I had 15 seconds before going with FIFO. 



[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

 }

public boolean equals(Object o) {
if (!(o instanceof ValueWrapper)) {
return false;
}
return (value == null ? ((ValueWrapper) o).value == 
null : value.equals(((ValueWrapper) o).value));
}

public int hashCode() {
// guard against a null value, which equals() above already allows
return value == null ? 0 : value.hashCode();
}

public V getValue() {
popularity = popularityCounter++;
return value;
}

}

}
{code}
  

[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

r)) {
return false;
}
return (value == null ? ((ValueWrapper) o).value == 
null : value.equals(((ValueWrapper) o).value));
}

public int hashCode() {
// guard against a null value, which equals() above already allows
return value == null ? 0 : value.hashCode();
}

public V getValue() {
popularity = popularityCounter++;
return value;
}

}

}
{code}
  
> FIFO Cache (Unsynchronized): 9x times performance boost
> ---
>
> Key: SOLR-665
> URL: https://issues.apache.org/jira/browse/SOLR-665
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 1.3
> Environment: JRockit R27 (Java 6)
>Reporter: Fuad Efendi
> Attachments: FIFOCache.java
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Attached is a modified version of LRUCache where:
> 1. map = new LinkedHashMap(initialSize, 0.75f, false) - so that 
> "reordering"/true (the performance bottleneck of LRU) is replaced with 
> "insertion-order"/false (so that it becomes FIFO)
> 2. Almost all (absolutely unnecessary) synchronized statements are commented out
> See discussion at 
> http://www.nabble.com/LRUCache---synchronized%21--td16439831.html
> Performance metrics (taken from SOLR Admin):
> LRU:
> Requests: 7638
> Average Time-Per-Request: 15300
> Average Request-per-Second: 0.06
> FIFO:
> Requests: 3355
> Average Time-Per-Request: 1610
> Average Request-per-Second: 0.11
> Performance increased 9 times, which roughly corresponds to the number of CPUs in 
> the system, http://www.tokenizer.org/ (Shopping Search Engine at Tokenizer.org)
> Current number of documents: 7494689
> name: filterCache
> class: org.apache.solr.search.LRUCache
> version: 1.0
> description: LRU Cache(maxSize=1000, initialSize=1000)
> stats:
> lookups : 15966954582
> hits : 16391851546
> hitratio : 0.102
> inserts : 4246120
> evictions : 0
> size : 2668705
> cumulative_lookups : 16415839763
> cumulative_hits : 16411608101
> cumulative_hitratio : 0.99
> cumulative_inserts : 4246246
> cumulative_evictions : 0
> Thanks
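
The difference between the two LinkedHashMap modes named in the issue description can be sketched as follows. This is only an illustration, not the attached FIFOCache.java: the class `EvictionOrderDemo`, the helper `BoundedMap`, and the sample keys are hypothetical. Only the three-argument `LinkedHashMap` constructor and `removeEldestEntry` come from the JDK. The sketch is single-threaded and says nothing about synchronization.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class EvictionOrderDemo {

    /** Bounded LinkedHashMap: accessOrder=true behaves as LRU, false as FIFO. */
    static class BoundedMap<K, V> extends LinkedHashMap<K, V> {
        private final int maxSize;

        BoundedMap(int maxSize, boolean accessOrder) {
            super(maxSize, 0.75f, accessOrder);
            this.maxSize = maxSize;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            // Called by put(); returning true drops the eldest entry.
            return size() > maxSize;
        }
    }

    /** Inserts a, b, c, reads a, inserts d; returns the surviving keys in order. */
    static List<String> survivors(boolean accessOrder) {
        BoundedMap<String, Integer> cache = new BoundedMap<String, Integer>(3, accessOrder);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.put("c", 3);
        cache.get("a");     // relinks "a" to the end only when accessOrder is true
        cache.put("d", 4);  // exceeds maxSize, so the eldest entry is evicted
        return new ArrayList<String>(cache.keySet());
    }

    public static void main(String[] args) {
        System.out.println("FIFO: " + survivors(false)); // [b, c, d]
        System.out.println("LRU:  " + survivors(true));  // [c, a, d]
    }
}
```

With `accessOrder=false` a `get()` does not modify the map structure, which is the property the issue relies on; with `accessOrder=true` every `get()` is a structural modification.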

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost


[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617639#action_12617639
 ] 

Fuad Efendi commented on SOLR-665:
--

This is an extremely simple concurrent LRU that I spent an hour writing; it is 
based on ConcurrentHashMap. I don't use java.util.concurrent.locks, and I am 
focusing on the _requirements only_, without implementing unnecessary 
methods of the Map interface (so I am not following the _contract_ ;) very sorry!)

{code}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ConcurrentLRU<K, V> {

    protected ConcurrentHashMap<K, ValueWrapper<V>> map;
    protected int maxEntries;

    public ConcurrentLRU(int maxEntries) {
        map = new ConcurrentHashMap<K, ValueWrapper<V>>();
        this.maxEntries = maxEntries;
    }

    // Returns the value just stored, not the previous mapping
    // (deliberately not the Map contract).
    public V put(K key, V value) {
        map.put(key, new ValueWrapper<V>(value));
        checkRemove();
        return value;
    }

    // Once over capacity, scan all entries and evict the one with the
    // lowest popularity stamp, i.e. the least recently used.
    void checkRemove() {
        if (map.size() <= maxEntries)
            return;
        Map.Entry<K, ValueWrapper<V>> eldestEntry = null;
        for (Map.Entry<K, ValueWrapper<V>> entry : map.entrySet()) {
            long popularity = entry.getValue().popularity;
            if (eldestEntry == null
                    || eldestEntry.getValue().popularity > popularity) {
                eldestEntry = entry;
            }
        }
        // Can be null if other threads emptied the map during the scan.
        if (eldestEntry != null)
            map.remove(eldestEntry.getKey(), eldestEntry.getValue());
    }

    public V get(Object key) {
        ValueWrapper<V> wrapper = map.get(key);
        return wrapper == null ? null : wrapper.getValue();
    }

    public final static class ValueWrapper<V> {
        // Shared logical clock. Note: ++ on a volatile long is not atomic,
        // so stamps are only approximate under contention.
        static volatile long popularityCounter;
        volatile long popularity;
        V value;

        ValueWrapper(V value) {
            this.value = value;
            popularity = popularityCounter++;
        }

        public boolean equals(Object o) {
            if (!(o instanceof ValueWrapper)) {
                return false;
            }
            Object other = ((ValueWrapper<?>) o).value;
            return value == null ? other == null : value.equals(other);
        }

        public int hashCode() {
            // Null-safe: a wrapped null value hashes to 0.
            return value == null ? 0 : value.hashCode();
        }

        // Every read refreshes the stamp, which is what makes this LRU.
        public V getValue() {
            popularity = popularityCounter++;
            return value;
        }

    }

}
{code}
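
The stamp in the code above is produced by `popularityCounter++` on a `static volatile long`, which is a read followed by a write and can lose updates when several threads read at once. One way to get exact stamps without java.util.concurrent.locks is an `AtomicLong`. The sketch below is an assumption about how that could look, not part of the posted class: `StampedValueDemo`, `StampedValue`, and `hammer` are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicLong;

class StampedValueDemo {

    /** A value plus the logical time of its last access (hypothetical stand-in for ValueWrapper). */
    static final class StampedValue<V> {
        private final AtomicLong clock;   // shared logical clock
        private final V value;
        private volatile long lastAccess; // last writer wins, which is fine for eviction hints

        StampedValue(V value, AtomicLong clock) {
            this.value = value;
            this.clock = clock;
            this.lastAccess = clock.incrementAndGet();
        }

        V getValue() {
            // incrementAndGet() is one atomic read-modify-write, so no tick is lost.
            lastAccess = clock.incrementAndGet();
            return value;
        }

        long lastAccess() {
            return lastAccess;
        }
    }

    /** Reads one value from several threads and returns the final clock reading. */
    static long hammer(int threads, final int readsPerThread) {
        final AtomicLong clock = new AtomicLong();
        final StampedValue<String> v = new StampedValue<String>("x", clock);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(new Runnable() {
                public void run() {
                    for (int i = 0; i < readsPerThread; i++) {
                        v.getValue();
                    }
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return clock.get();
    }

    public static void main(String[] args) {
        // One tick for construction plus one per read.
        System.out.println(hammer(4, 100000));
    }
}
```

Whether exact stamps matter for a cache is a separate question; an approximate ordering may be good enough for eviction.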


[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

apacity = newTable.length;
for (int j = 0; j < src.length; j++) {
Entry e = src[j];
if (e != null) {
src[j] = null;
do {
Entry next = e.next;
int i = indexFor(e.hash, newCapacity);  
e.next = newTable[i];
newTable[i] = e;
e = next;
} while (e != null);
}
}
}
{code}

- We won't even get a NullPointerException after src[j] = null.

P.S.
Of course, I agree: these are Java internals, not the public Map 
interface _contract_. Should we avoid relying on the implementation then? I believe it 
is specified somewhere in a JSR too...
{code}
 * @author  Doug Lea
 * @author  Josh Bloch
 * @author  Arthur van Hoff
 * @author  Neal Gafter
 * @version 1.65, 03/03/05
{code}

P.P.S.
Do not forget to look at the top of this discussion:
{code}
description: xxx Cache(maxSize=1000, initialSize=1000) 
size : 2668705
cumulative_inserts : 4246246
{code}

- _cumulative_inserts_ is almost double the _size_, which shows that 
double inserts are real.
- I checked catalina_out: no NullPointerException, 
ArrayIndexOutOfBoundsException, etc.
- I don't think we should worry _too much_ about a possible change of the Map 
implementation by SUN :P... otherwise we should use neither java.lang.String 
nor java.util.Date (some are placed in the wrong packages).
  

[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost


[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617562#action_12617562
 ] 

Fuad Efendi commented on SOLR-665:
--

{quote}Simply replacing synchronized with java.util.concurrent.locks doesn't 
increase performance. There needs to be a specific strategy for employing these 
locks in a way that makes sense.{quote}

I absolutely agree... 

ConcurrentHashMap provides an acceptable level of safety, but for specific 
tasks only:
bq. They do not throw ConcurrentModificationException. However, iterators are 
designed to be used by only one thread at a time. 

We can try to design specific caches that directly implement the Map or ConcurrentMap 
interfaces. We should define 'safety' levels (for instance, _null_ is not a 
problem if the cache already contains the object, added concurrently by another 
thread; cache elements are explicitly immutable objects; etc.). 
FIFO looks simplest and it _does_ work indeed; for LRU we need reordering on 
each get(), _OR_ we can weaken it by using approximate reordering 
somehow...
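
A FIFO cache of the kind described above can be sketched on top of ConcurrentHashMap without any locks. This is a sketch under assumptions, not the attached FIFOCache.java: the class name `ConcurrentFifoCache` and its methods are hypothetical, the size bound is soft under contention, and null keys or values are not supported because ConcurrentHashMap rejects them.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

class ConcurrentFifoCache<K, V> {
    private final ConcurrentHashMap<K, V> map = new ConcurrentHashMap<K, V>();
    private final ConcurrentLinkedQueue<K> order = new ConcurrentLinkedQueue<K>();
    private final int maxEntries;

    ConcurrentFifoCache(int maxEntries) {
        this.maxEntries = maxEntries;
    }

    /** Plain read: FIFO needs no reordering on access. */
    V get(K key) {
        return map.get(key);
    }

    /**
     * Stores the value unless the key is already cached; returns the cached value.
     * A duplicate insert from another thread is simply dropped.
     */
    V putIfAbsent(K key, V value) {
        V previous = map.putIfAbsent(key, value);
        if (previous != null) {
            return previous;
        }
        order.add(key);
        // Evict in insertion order until the map is back under the bound.
        while (map.size() > maxEntries) {
            K eldest = order.poll();
            if (eldest == null) {
                break;
            }
            map.remove(eldest);
        }
        return value;
    }

    int size() {
        return map.size();
    }

    public static void main(String[] args) {
        ConcurrentFifoCache<String, String> cache = new ConcurrentFifoCache<String, String>(2);
        cache.putIfAbsent("a", "A");
        cache.putIfAbsent("b", "B");
        cache.putIfAbsent("c", "C"); // evicts "a", the oldest insert
        System.out.println(cache.get("a") + " " + cache.get("b") + " " + cache.get("c"));
    }
}
```

The 'safety level' here is that a reader may miss an entry that is being inserted or evicted at the same moment, which costs a recomputation but never returns a wrong value as long as the cached objects are immutable.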


[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost


[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617529#action_12617529
 ] 

funtick edited comment on SOLR-665 at 7/28/08 1:04 PM:
---

The concerns are probably due to a misunderstanding of some _contract_... 

{code}
/**
 * The table, resized as necessary. Length MUST Always be a power of two.
 */
transient Entry[] table;

void resize(int newCapacity) {
    Entry[] oldTable = table;
    int oldCapacity = oldTable.length;
    if (oldCapacity == MAXIMUM_CAPACITY) {
        threshold = Integer.MAX_VALUE;
        return;
    }

    Entry[] newTable = new Entry[newCapacity];
    transfer(newTable);
    table = newTable;
    threshold = (int)(newCapacity * loadFactor);
}


public V get(Object key) {
    if (key == null)
        return getForNullKey();
    int hash = hash(key.hashCode());
    for (Entry<K,V> e = table[indexFor(hash, table.length)];
         e != null;
         e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
            return e.value;
    }
    return null;
}

{code}


- In the worst case we will hold a pointer to the _old_ table, and even with a _new_ one of 
smaller size we won't get _any_ ArrayIndexOutOfBounds.
- There is no _contract_ requiring synchronization on get() of HashMap or 
LinkedHashMap; it IS application specific.
- We will never get _wrong_ results because Entry is immutable.
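
The "power of two" remark in the table comment is what keeps the bucket index in range. A minimal sketch, assuming `indexFor` is the usual mask `h & (length - 1)` (it is not shown in the excerpt above); the class `IndexForDemo` and the sample hashes are hypothetical. It only illustrates the arithmetic for a single table length and says nothing about what happens when the table reference and its length are read at different moments.

```java
class IndexForDemo {

    /** Masking as assumed for HashMap.indexFor; valid only when length is a power of two. */
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    /** True if every sampled hash maps into [0, length). */
    static boolean alwaysInRange(int length) {
        int[] samples = {0, 1, -1, 17, Integer.MAX_VALUE, Integer.MIN_VALUE, 0x7fff0001};
        for (int h : samples) {
            int i = indexFor(h, length);
            if (i < 0 || i >= length) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(indexFor(17, 16)); // 1
        System.out.println(indexFor(17, 32)); // 17
        System.out.println(alwaysInRange(16));
    }
}
```

Note that the same hash lands on different indexes for different lengths (1 versus 17 above), so an index is only known to be valid for a table at least as long as the length it was computed from.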

{code}
/** 
 * Transfer all entries from current table to newTable.
 */
void transfer(Entry[] newTable) {
    Entry[] src = table;
    int newCapacity = newTable.length;
    for (int j = 0; j < src.length; j++) {
        Entry<K,V> e = src[j];
        if (e != null) {
            src[j] = null;
            do {
                Entry<K,V> next = e.next;
                int i = indexFor(e.hash, newCapacity);  
                e.next = newTable[i];
                newTable[i] = e;
                e = next;
            } while (e != null);
        }
    }
}
{code}

- We won't even get a NullPointerException after src[j] = null.

P.S.
Of course, I agree: these are Java internals, not the public Map 
interface _contract_. Should we avoid relying on the implementation then, and base 
the decision on a specific implementation from SUN? I believe it is specified 
somewhere in a JSR too...
{code}
 * @author  Doug Lea
 * @author  Josh Bloch
 * @author  Arthur van Hoff
 * @author  Neal Gafter
 * @version 1.65, 03/03/05
{code}


[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost


[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617529#action_12617529
 ] 

funtick edited comment on SOLR-665 at 7/28/08 12:54 PM:


concerns are probably because of misunderstanding of  some _contract_... 

{code}
/**
 * The table, resized as necessary. Length MUST Always be a power of two.
 */
transient Entry[] table;

void resize(int newCapacity) {
Entry[] oldTable = table;
int oldCapacity = oldTable.length;
if (oldCapacity == MAXIMUM_CAPACITY) {
threshold = Integer.MAX_VALUE;
return;
}

Entry[] newTable = new Entry[newCapacity];
transfer(newTable);
table = newTable;
threshold = (int)(newCapacity * loadFactor);
}


public V get(Object key) {
if (key == null)
return getForNullKey();
int hash = hash(key.hashCode());
for (Entry e = table[indexFor(hash, table.length)];
 e != null;
 e = e.next) {
Object k;
if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
return e.value;
}
return null;
}

{code}


- in worst case we will have pointer to _old_ table and even with _new_ one of 
smaller size we won't get _any_ ArrayIndexOutOfBounds.
- There is no any _contract_ requiring synchronization on get() of HashMap or 
LinkedHashMap; it IS application specific.
- we will never have _wrong_ results because Entry is immutable


[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost


[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617529#action_12617529
 ] 

funtick edited comment on SOLR-665 at 7/28/08 12:51 PM:


concerns are probably because of misunderstanding of  some _contract_... 

{code}
/**
 * The table, resized as necessary. Length MUST Always be a power of two.
 */
transient Entry[] table;

void resize(int newCapacity) {
Entry[] oldTable = table;
int oldCapacity = oldTable.length;
if (oldCapacity == MAXIMUM_CAPACITY) {
threshold = Integer.MAX_VALUE;
return;
}

Entry[] newTable = new Entry[newCapacity];
transfer(newTable);
table = newTable;
threshold = (int)(newCapacity * loadFactor);
}


public V get(Object key) {
if (key == null)
return getForNullKey();
int hash = hash(key.hashCode());
for (Entry e = table[indexFor(hash, table.length)];
 e != null;
 e = e.next) {
Object k;
if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
return e.value;
}
return null;
}

{code}


- in worst case we will have pointer to _old_ table and even with _new_ one of 
smaller size we won't get _any_ ArrayIndexOutOfBounds.
There is no any _contract_ requiring synchronization on get() of HashMap or 
LinkedHashMap; it IS application specific.
- we will never have _wrong_ results because Entry is immutable


[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost


[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617529#action_12617529
 ] 

Fuad Efendi commented on SOLR-665:
--

concerns are probably because of misunderstanding of  some _contract_... 

{code}
/**
 * The table, resized as necessary. Length MUST Always be a power of two.
 */
transient Entry[] table;

void resize(int newCapacity) {
Entry[] oldTable = table;
int oldCapacity = oldTable.length;
if (oldCapacity == MAXIMUM_CAPACITY) {
threshold = Integer.MAX_VALUE;
return;
}

Entry[] newTable = new Entry[newCapacity];
transfer(newTable);
table = newTable;
threshold = (int)(newCapacity * loadFactor);
}


public V get(Object key) {
if (key == null)
return getForNullKey();
int hash = hash(key.hashCode());
for (Entry e = table[indexFor(hash, table.length)];
 e != null;
 e = e.next) {
Object k;
if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
return e.value;
}
return null;
}

{code}


- in worst case we will have pointer to _old_ table and even with _new_ one of 
smaller size we won't get _any_ ArrayIndexOutOfBounds.
There is no any _contract_ requiring synchronization on get() of HashMap or 
LinkedHashMap; it IS application specific.



[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost


[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617523#action_12617523
 ] 

Fuad Efendi commented on SOLR-665:
--

We need to invite Doug Lea to this discussion...
http://en.wikipedia.org/wiki/Doug_Lea
http://gee.cs.oswego.edu/dl/index.html

We may simply use java.util.concurrent.locks instead of the heavy synchronized... 
We may also use the Executor framework instead of single-threaded faceting... We may 
even base SOLR on the Apache MINA project.
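
As a rough idea of what java.util.concurrent.locks could look like for a cache, here is a sketch of an insertion-ordered (FIFO) map guarded by a `ReentrantReadWriteLock`, so that readers do not block each other. The class `RwLockFifoCache` is hypothetical and is not taken from Solr. The read lock is only sufficient because `accessOrder` is false; an access-ordered (LRU) LinkedHashMap modifies itself on every `get()` and would need the write lock there too.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class RwLockFifoCache<K, V> {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final Map<K, V> map;

    RwLockFifoCache(final int maxEntries) {
        // accessOrder=false: get() leaves the map structure untouched.
        map = new LinkedHashMap<K, V>(maxEntries, 0.75f, false) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;
            }
        };
    }

    /** Many readers may hold the read lock at the same time. */
    V get(K key) {
        lock.readLock().lock();
        try {
            return map.get(key);
        } finally {
            lock.readLock().unlock();
        }
    }

    /** Writers are exclusive; eviction happens inside put(). */
    V put(K key, V value) {
        lock.writeLock().lock();
        try {
            return map.put(key, value);
        } finally {
            lock.writeLock().unlock();
        }
    }

    int size() {
        lock.readLock().lock();
        try {
            return map.size();
        } finally {
            lock.readLock().unlock();
        }
    }

    public static void main(String[] args) {
        RwLockFifoCache<String, Integer> cache = new RwLockFifoCache<String, Integer>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.put("c", 3); // evicts "a"
        System.out.println(cache.get("a") + " " + cache.get("c") + " " + cache.size());
    }
}
```

Whether this beats plain synchronized depends on the read/write mix and the JVM, so it would need to be measured under the same faceting load.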

> FIFO Cache (Unsynchronized): 9x times performance boost
> ---
>
> Key: SOLR-665
> URL: https://issues.apache.org/jira/browse/SOLR-665
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 1.3
> Environment: JRockit R27 (Java 6)
>Reporter: Fuad Efendi
> Attachments: FIFOCache.java
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Attached is modified version of LRUCache where 
> 1. map = new LinkedHashMap(initialSize, 0.75f, false) - so that 
> "reordering"/true (performance bottleneck of LRU) is replaced to 
> "insertion-order"/false (so that it became FIFO)
> 2. Almost all (absolutely unneccessary) synchronized statements commented out
> See discussion at 
> http://www.nabble.com/LRUCache---synchronized%21--td16439831.html
> Performance metrics (taken from SOLR Admin):
> LRU
> Requests: 7638
> Average Time-Per-Request: 15300
> Average Request-per-Second: 0.06
> FIFO:
> Requests: 3355
> Average Time-Per-Request: 1610
> Average Request-per-Second: 0.11
> Performance increased 9 times, which roughly corresponds to the number of CPUs in 
> the system, http://www.tokenizer.org/ (Shopping Search Engine at Tokenizer.org)
> Current number of documents: 7494689
> name:  filterCache  
> class:org.apache.solr.search.LRUCache  
> version:  1.0  
> description:  LRU Cache(maxSize=1000, initialSize=1000)  
> stats:lookups : 15966954582
> hits : 16391851546
> hitratio : 0.102
> inserts : 4246120
> evictions : 0
> size : 2668705
> cumulative_lookups : 16415839763
> cumulative_hits : 16411608101
> cumulative_hitratio : 0.99
> cumulative_inserts : 4246246
> cumulative_evictions : 0 
> Thanks


[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost


[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617520#action_12617520
 ] 

Fuad Efendi commented on SOLR-665:
--

Shalin, we are already using _AtomicLong of Java 5_; the JVM is a different story... 
JRockit R27 is a JVM from BEA-Oracle, and it is JDK 6 powered (rt.jar comes from 
SUN).
I just tried to compare synchronized with unsynchronized and found that it _is_ the 
main problem for faceting...
Another problem: somehow faceting recalculates each time (using filterCache 
during repeated _recalculations_); queryCache is not enough...


[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost


[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617516#action_12617516
 ] 

funtick edited comment on SOLR-665 at 7/28/08 12:14 PM:


Mark, thanks for going deeper. Yes, _resize_ may change the ( Entry[ ] ) _table_; 
the _key_ will disappear from its _bucket_ and _get()_ returns null:
{code}
/**
 * The table, resized as necessary. Length MUST Always be a power of two.
 */
transient Entry[] table;
...
public V get(Object key) {
    if (key == null)
        return getForNullKey();
    int hash = hash(key.hashCode());
    for (Entry<K,V> e = table[indexFor(hash, table.length)];
         e != null;
         e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
            return e.value;
    }
    return null;
}
{code}

We might get _null_ instead of MyValue (during a _resize_ by another thread); not a 
big deal! It is still thread-safe. We will add a new entry (overwriting the existing 
one) in such extremely rare cases.

- get() will _never_ return a wrong value (the Entry object is immutable in our case):
{code}
if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
    return e.value;
{code}
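
To make the "key disappears from its bucket" point concrete, here is a small sketch of the bucket arithmetic. The indexFor body is the one from the JDK 6 HashMap source (hash masked by length - 1, which is why the length must be a power of two). The class name and the sample hash value are made up for illustration.

```java
/** Illustrative only: mirrors the bucket-index arithmetic of JDK 6 HashMap. */
final class BucketIndexDemo {
    // Valid only because the table length is always a power of two.
    static int indexFor(int hash, int length) {
        return hash & (length - 1);
    }

    public static void main(String[] args) {
        int hash = 0x35; // hypothetical, already-spread hash value (decimal 53)
        // The same hash lands in a different bucket once the table doubles,
        // so a reader scanning the old bucket index can miss the entry.
        System.out.println(indexFor(hash, 16)); // 5
        System.out.println(indexFor(hash, 32)); // 21
    }
}
```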




[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost


[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617516#action_12617516
 ] 

funtick edited comment on SOLR-665 at 7/28/08 12:12 PM:


Mark, thanks for going deeper. Yes, _resize_ may change the ( Entry[ ] ) _table_; 
the _key_ will disappear from its _bucket_ and _get()_ returns null:
{code}
/**
 * The table, resized as necessary. Length MUST Always be a power of two.
 */
transient Entry[] table;
...
public V get(Object key) {
    if (key == null)
        return getForNullKey();
    int hash = hash(key.hashCode());
    for (Entry<K,V> e = table[indexFor(hash, table.length)];
         e != null;
         e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
            return e.value;
    }
    return null;
}
{code}

We might get _null_ instead of MyValue (during a _resize_ by another thread); not a 
big deal! It is still thread-safe. We will add a new entry (overwriting the existing 
one) in such extremely rare cases.

- get() will _never_ return a wrong value (the Entry object is immutable):
{code}
if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
    return e.value;
{code}




[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost


[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617516#action_12617516
 ] 

funtick edited comment on SOLR-665 at 7/28/08 12:06 PM:


Mark, thanks for going deeper. Yes, _resize_ may change the ( Entry[ ] ) _table_; 
the _key_ will disappear from its _bucket_ and _get()_ returns null:
{code}
/**
 * The table, resized as necessary. Length MUST Always be a power of two.
 */
transient Entry[] table;
...
public V get(Object key) {
    if (key == null)
        return getForNullKey();
    int hash = hash(key.hashCode());
    for (Entry<K,V> e = table[indexFor(hash, table.length)];
         e != null;
         e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
            return e.value;
    }
    return null;
}
{code}

We might get _null_ instead of MyValue (during a _resize_ by another thread); not a 
big deal! It is still thread-safe. We will add a new entry (overwriting the existing 
one) in such extremely rare cases.


[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost


[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617516#action_12617516
 ] 

Fuad Efendi commented on SOLR-665:
--

Mark, thanks for going deeper. Yes, _resize_ may change the ( Entry[ ] ) _table_; 
the _key_ will disappear from its _bucket_ and _get()_ returns null:
{code}
/**
 * The table, resized as necessary. Length MUST Always be a power of two.
 */
transient Entry[] table;
...
public V get(Object key) {
    if (key == null)
        return getForNullKey();
    int hash = hash(key.hashCode());
    for (Entry<K,V> e = table[indexFor(hash, table.length)];
         e != null;
         e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
            return e.value;
    }
    return null;
}
{code}

We might get _null_ instead of MyValue (during a _resize_ by another thread); not a 
big deal! It is still thread-safe. We will add a new entry (overwriting the existing 
one) in such extremely rare cases.


[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost


[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617482#action_12617482
 ] 

Fuad Efendi commented on SOLR-665:
--

ConcurrentModificationException is thrown only when we iterate over the Map and 
another thread modifies its structure; with LRU each get() modifies the structure, 
with FIFO only inserts do...
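
The difference between the two orderings can be seen in a small single-threaded sketch (the class and method names are made up for illustration). The third constructor argument of LinkedHashMap is accessOrder: with true, every get() relinks the entry to the end of the internal list; with false, get() leaves the list untouched.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: shows how get() affects iteration order. */
final class AccessOrderDemo {
    /** Returns the key order after inserting a, b, c and then reading "a". */
    static List<String> keysAfterGet(boolean accessOrder) {
        Map<String, Integer> map =
            new LinkedHashMap<String, Integer>(16, 0.75f, accessOrder);
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        map.get("a"); // with accessOrder=true this moves "a" to the end
        return new ArrayList<String>(map.keySet());
    }

    public static void main(String[] args) {
        System.out.println(keysAfterGet(true));  // LRU-style:  [b, c, a]
        System.out.println(keysAfterGet(false)); // FIFO-style: [a, b, c]
    }
}
```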


[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost


[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617481#action_12617481
 ] 

Fuad Efendi commented on SOLR-665:
--

bq. ...and at least one of the threads modifies the map structurally, it 
must be synchronized externally.

- so the thread running an _insert_ must be synchronized, and _not_ the thread 
running a _retrieve_. Again, we need to synchronize _insert_ only to avoid memory 
leaks, nothing more. SOLR does not iterate over the Map.



[jira] Reopened: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost


 [ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fuad Efendi reopened SOLR-665:
--


The get() method does not need to be synchronized for a FIFO cache. Unsynchronized 
object retrieval is not a structural modification of an _Insertion-Ordered_ 
LinkedHashMap. An unsynchronized cache improves the performance of multithreaded 
applications linearly with the number of CPUs.


[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost


[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617466#action_12617466
 ] 

Fuad Efendi commented on SOLR-665:
--

Why? We have a PoC at least, and we know the bottleneck! 
We need an improvement: avoid _some_ synchronization; it is extremely easy with 
FIFO. We may try to improve LRU too.
Not everything in Java is extremely good: synchronization, for instance. Even 
for a single-threaded application, it costs an additional 600 CPU cycles (I've 
read it somewhere for SUN Java 1.3 on Windows).

Yonik, please allow some time to think / to discuss. 

I'll also try to provide a 'concurrent' LRU; but this issue is FIFO related.



[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost


[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617454#action_12617454
 ] 

Fuad Efendi commented on SOLR-665:
--

bq. If multiple threads access this map concurrently, and at least one of the 
threads modifies the map structurally, it must be synchronized externally. (A 
structural modification is any operation that adds or deletes one or more 
mappings; merely changing the value associated with a key that an instance 
already contains is not a structural modification.)

I have already commented on this. Just try to avoid 'books' and 'authors'; also 
try to find the meaning in the JavaDocs, and try to browse the Java source instead...

- _structural_  _modification_, the related _ConcurrentModificationException_, and 
the related _Iterator_: not applicable to the SOLR cache. 

Will never happen. We need to synchronize inserts because each insert may 
calculate the size and remove the 'eldest' entry, and we need to avoid OOMs. We need 
to synchronize retrieves for LRU because a 'Linked' HashMap (with access order) will 
change links (will reorder the Map). And we do not need to synchronize retrieves 
from an Insertion-Ordered LinkedHashMap (FIFO). It is classic... I'd like to 
research java.util.concurrent more
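
A single-threaded sketch of the "insert calculates size and removes eldest" step, assuming the usual removeEldestEntry hook of LinkedHashMap. BoundedFifoMap is a made-up name and not the attached FIFOCache.java. It shows only the eviction mechanics and says nothing about thread safety.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: size-bounded, insertion-ordered map. */
final class BoundedFifoMap<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int maxSize;

    BoundedFifoMap(int maxSize) {
        super(16, 0.75f, false); // false = insertion order (FIFO)
        this.maxSize = maxSize;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called by put() after each insert; true evicts the oldest entry.
        return size() > maxSize;
    }

    public static void main(String[] args) {
        BoundedFifoMap<String, Integer> cache = new BoundedFifoMap<String, Integer>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");    // a read does not protect "a" under insertion order
        cache.put("c", 3); // evicts "a", the oldest insert
        System.out.println(cache.keySet()); // [b, c]
    }
}
```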



[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost


[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617445#action_12617445
 ] 

Fuad Efendi commented on SOLR-665:
--

bq. bq. absolutely no need to synchronize get() method for FIFO!

bq. A cache is not a read-only object. Gets need to be synchronized because 
other threads can be changing the cache.

I am familiar with Doug Lea's findings (he wrote his book in 1996, and it 
became part of java.util.concurrent after 10(!!!) years). 

bq. changing the cache
- what is a 'cache change'? 
Changing the stored key or changing the referenced object never ever happens in 
SOLR. Removal of an object - yes. More correctly: removal of a key. get(MyKey) will 
return null OR will return MyValue, and "ConcurrentModificationException" will never 
be thrown in SOLR (we are not using an Iterator!). We can concurrently insert the 
same (key, value) pairs - not a problem.
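
For comparison only: java.util.concurrent already offers a map where concurrent inserts of the same (key, value) pairs are explicitly supported, namely ConcurrentHashMap. The sketch below is a different technique from the unsynchronized LinkedHashMap discussed in this issue, it has no FIFO eviction, and the class name is made up.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Illustrative only: two threads insert the same pairs concurrently. */
final class ConcurrentInsertDemo {
    static int insertFromTwoThreads() {
        final ConcurrentMap<String, String> cache =
            new ConcurrentHashMap<String, String>();
        Runnable writer = new Runnable() {
            public void run() {
                for (int i = 0; i < 1000; i++) {
                    // Keeps the first value; a duplicate insert is a no-op.
                    cache.putIfAbsent("key" + i, "value" + i);
                }
            }
        };
        Thread t1 = new Thread(writer);
        Thread t2 = new Thread(writer);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return cache.size();
    }

    public static void main(String[] args) {
        System.out.println(insertFromTwoThreads()); // 1000
    }
}
```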



[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost


[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617443#action_12617443
 ] 

Fuad Efendi commented on SOLR-665:
--

- Classes from java.util.concurrent.atomic are designed to be used WITHOUT 
synchronization; the per-instance stats should be declared as AtomicLong 
instead of {{long}}:
{{  private long lookups;
  private long hits;
  private long inserts;
  private long evictions;}}
- the get() method of FIFO does not need any synchronization
- the get() method of LRU reorders the LinkedHashMap; unsynchronized access may 
cause orphan Entry objects
- synchronized insertion for FIFO won't cause performance degradation because 
get() is unsynchronized.
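
The AtomicLong suggestion above could look roughly like the sketch below. 
The class and method names (CacheStats, recordLookup, hitRatio and so on) are 
hypothetical and are not taken from Solr's LRUCache or from the attached 
FIFOCache.java; only the four counter names come from the snippet quoted 
above. It shows lock-free counters only and says nothing about the thread 
safety of the underlying map.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stats holder; only the four field names come from the comment above. */
public class CacheStats {
    private final AtomicLong lookups = new AtomicLong();
    private final AtomicLong hits = new AtomicLong();
    private final AtomicLong inserts = new AtomicLong();
    private final AtomicLong evictions = new AtomicLong();

    /** Counts one lookup, and one hit if the lookup found a value. */
    public void recordLookup(boolean hit) {
        lookups.incrementAndGet();
        if (hit) {
            hits.incrementAndGet();
        }
    }

    public void recordInsert() { inserts.incrementAndGet(); }

    public void recordEviction() { evictions.incrementAndGet(); }

    public long getLookups() { return lookups.get(); }

    public long getHits() { return hits.get(); }

    public long getInserts() { return inserts.get(); }

    public long getEvictions() { return evictions.get(); }

    /** Hits divided by lookups, or 0.0 before the first lookup. */
    public double hitRatio() {
        long l = lookups.get();
        return l == 0 ? 0.0 : (double) hits.get() / l;
    }
}
```

Each counter is updated atomically on its own, so a reader may briefly see 
lookups and hits that are slightly out of step with each other; for admin 
statistics that is usually acceptable.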


[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost


[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617441#action_12617441
 ] 

Fuad Efendi commented on SOLR-665:
--

Regarding Thread Safety:

- yes, we need synchronized get() method for LRU cache because each access 
_reorders_ LinkedHashMap.
- absolutely no need to synchronize get() method for FIFO!
- we probably need to deal with insertion, but without synchronizing it: 
instead, extend LinkedHashMap and make only the 'removal' synchronized...  With 
caches large enough to store all objects we do not need even that. We probably 
do not need to synchronize 'removal' at all - it removes the entry but does not 
remove/finalize the referenced object.

From JavaDoc: "Note that this implementation is not synchronized. If multiple 
threads access a linked hash map concurrently, and at least one of the threads 
modifies the map structurally, it must be synchronized externally."
However, we do not modify the cache structurally during an iteration loop or 
any other 'structure' access (we do not use an Iterator!) - so the advice from 
the JavaDoc is not applicable.

We should synchronize only removeEntryForKey of HashMap; unfortunately we can't 
do that without rewriting HashMap. We can probably use ConcurrentHashMap as a 
base class of LinkedHashMap, but I don't know the answer yet... I am guessing 
that unsynchronized entry removal won't be a significant issue in a 
multithreaded environment.
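
The difference between the two LinkedHashMap modes discussed above can be seen 
in a small single-threaded sketch. The class and method names (OrderDemo, 
bounded, keysAfterScenario) are made up for illustration and are not part of 
Solr or of the attached FIFOCache.java; the three-argument LinkedHashMap 
constructor and removeEldestEntry are standard JDK API. The sketch only shows 
that get() restructures the map in access-order mode and does not in 
insertion-order mode; it makes no claim about thread safety.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class OrderDemo {

    /** Bounded map: accessOrder=true behaves as LRU, accessOrder=false as FIFO. */
    public static <K, V> Map<K, V> bounded(final int maxSize, boolean accessOrder) {
        return new LinkedHashMap<K, V>(16, 0.75f, accessOrder) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                // Called by put(); returning true evicts the head of the list.
                return size() > maxSize;
            }
        };
    }

    /** Inserts a and b, reads a, inserts c; returns the surviving keys in map order. */
    public static List<String> keysAfterScenario(boolean accessOrder) {
        Map<String, Integer> m = bounded(2, accessOrder);
        m.put("a", 1);
        m.put("b", 2);
        m.get("a");     // access-order: moves "a" to the tail; insertion-order: no change
        m.put("c", 3);  // size exceeds 2, so the eldest entry is evicted
        return new ArrayList<String>(m.keySet());
    }

    public static void main(String[] args) {
        System.out.println("LRU:  " + keysAfterScenario(true));   // [a, c]
        System.out.println("FIFO: " + keysAfterScenario(false));  // [b, c]
    }
}
```

In access-order mode the read of "a" saves it from eviction, so "b" is dropped; 
in insertion-order mode the read changes nothing and "a", the oldest insert, is 
dropped.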


[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost


[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617359#action_12617359
 ] 

Fuad Efendi commented on SOLR-665:
--

Almost forgot: I am estimating the performance gains based on a real 
multithreaded application in production: Tomcat 6.0.16 & JRockit R27 (Java 6) & 
SLES10 & two quad-core Opteron 2350 (8 CPUs total) & 25Gb for SOLR...
Also, the first queries run for a long time, more than a minute (warming up 
the caches with the faceted query id:[* TO *]); the average Time-Per-Request 
decreases over time and is now 1591.5232, giving a 10x performance boost.
Facets are highly distributed, as you can see from the website and the 
filterCache... HTTP caching is supported - to see real timing you should 
execute real queries...
ConcurrentHashMap is not applicable - we are not actually modifying cached 
items... FIFO is without the 'Out' if we have enough memory.


[jira] Commented: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost


[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617357#action_12617357
 ] 

Fuad Efendi commented on SOLR-665:
--

I renamed the class to FIFOCache just before opening the issue; on my local 
system it is still the (modified) LRUCache, which is why the filterCache stats 
refer to the 'old' name...


[jira] Updated: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost


 [ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fuad Efendi updated SOLR-665:
-

Attachment: FIFOCache.java


[jira] Created: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost