[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617431#action_12617431
 ] 

noble.paul edited comment on SOLR-665 at 7/28/08 7:10 AM:
--

I guess this is safe and faster
{code}
public Object get(Object key) {
  synchronized (map) {
    Object val = map.get(key);
    if (state == State.LIVE) {
      // only increment lookups and hits if we are live.
      lookups++;
    }
    stats.lookups.incrementAndGet();
    if (val != null) {
      // hits++; we must remove the hits field; this needs changes to getStatistics too
      stats.hits.incrementAndGet();
    }
    return val;
  }
}
{code}

let us remove the field hits and use stats.hits wherever we need it

  was (Author: noble.paul):
I guess this is safe and faster
{code}
public Object get(Object key) {
  synchronized (map) {
    Object val = map.get(key);
    if (state == State.LIVE) {
      // only increment lookups and hits if we are live.
      lookups++;
    }
    stats.lookups.incrementAndGet();
    if (val != null) {
      hits++;
      stats.hits.incrementAndGet();
    }
    return val;
  }
}
{code}
  
> FIFO Cache (Unsynchronized): 9x times performance boost
> ---
>
> Key: SOLR-665
> URL: https://issues.apache.org/jira/browse/SOLR-665
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 1.3
> Environment: JRockit R27 (Java 6)
>Reporter: Fuad Efendi
> Attachments: FIFOCache.java
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Attached is a modified version of LRUCache where 
> 1. map = new LinkedHashMap(initialSize, 0.75f, false) - so that 
> "reordering"/true (the performance bottleneck of LRU) is replaced by 
> "insertion-order"/false (so that it becomes FIFO)
> 2. Almost all (absolutely unnecessary) synchronized statements are commented out
> See discussion at 
> http://www.nabble.com/LRUCache---synchronized%21--td16439831.html
> Performance metrics (taken from SOLR Admin):
> LRU
> Requests: 7638
> Average Time-Per-Request: 15300
> Average Request-per-Second: 0.06
> FIFO:
> Requests: 3355
> Average Time-Per-Request: 1610
> Average Request-per-Second: 0.11
> Performance increased 9 times, which roughly corresponds to the number of CPUs in 
> the system, http://www.tokenizer.org/ (Shopping Search Engine at Tokenizer.org)
> Current number of documents: 7494689
> name:  filterCache  
> class:org.apache.solr.search.LRUCache  
> version:  1.0  
> description:  LRU Cache(maxSize=1000, initialSize=1000)  
> stats:lookups : 15966954582
> hits : 16391851546
> hitratio : 0.102
> inserts : 4246120
> evictions : 0
> size : 2668705
> cumulative_lookups : 16415839763
> cumulative_hits : 16411608101
> cumulative_hitratio : 0.99
> cumulative_inserts : 4246246
> cumulative_evictions : 0 
> Thanks
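The LinkedHashMap change in point 1 can be illustrated with a small, self-contained sketch (class and variable names here are illustrative, not the attached FIFOCache.java): with accessOrder=false, get() never restructures the map, so iteration (and eviction of the eldest entry) stays in insertion order, i.e. FIFO; with accessOrder=true, every get() moves the entry to the tail, which is the structural modification that forces LRUCache to synchronize reads.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

public class FifoVsLruDemo {
    public static void main(String[] args) {
        // accessOrder=false: iteration order is insertion order (FIFO);
        // get() never restructures the map.
        Map<String, Integer> fifo = new LinkedHashMap<String, Integer>(16, 0.75f, false);
        // accessOrder=true: get() moves the entry to the tail (LRU);
        // that reordering is why reads must be synchronized.
        Map<String, Integer> lru = new LinkedHashMap<String, Integer>(16, 0.75f, true);

        for (Map<String, Integer> m : Arrays.asList(fifo, lru)) {
            m.put("a", 1);
            m.put("b", 2);
            m.put("c", 3);
            m.get("a"); // read "a" after inserting all three
        }
        System.out.println(fifo.keySet()); // insertion order preserved: [a, b, c]
        System.out.println(lru.keySet());  // "a" moved to the tail: [b, c, a]
    }
}
```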

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617431#action_12617431
 ] 

noble.paul edited comment on SOLR-665 at 7/28/08 7:19 AM:
--

I guess this is safe and faster
{code}
public Object get(Object key) {
  Object val;
  synchronized (map) {
    val = map.get(key);
    if (state == State.LIVE) {
      // only increment lookups and hits if we are live.
      lookups++;
    }
  }
  stats.lookups.incrementAndGet();
  if (val != null) {
    // hits++; this must be removed
    stats.hits.incrementAndGet();
  }
  return val;
}
{code}

let us remove the field hits and use stats.hits wherever we need it

  was (Author: noble.paul):
I guess this is safe and faster
{code}
public Object get(Object key) {
  synchronized (map) {
    Object val = map.get(key);
    if (state == State.LIVE) {
      // only increment lookups and hits if we are live.
      lookups++;
    }
    stats.lookups.incrementAndGet();
    if (val != null) {
      // hits++; we must remove the hits field; this needs changes to getStatistics too
      stats.hits.incrementAndGet();
    }
    return val;
  }
}
{code}

let us remove the field hits and use stats.hits wherever we need it
  



[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617436#action_12617436
 ] 

[EMAIL PROTECTED] edited comment on SOLR-665 at 7/28/08 7:35 AM:


Your version changes what the method does in a couple of respects though.
bq. let us remove the field hits and use stats.hits wherever we need it
stats.hits is for all caches of this type (it's shared across searchers).  hits 
is local to this cache instance.
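The distinction can be sketched in a few lines (the Stats and Cache classes below are hypothetical stand-ins, not Solr's actual types): the shared AtomicLong survives searcher swaps and accumulates across every cache instance of the type, while the plain field counts only for the instance holding it.

```java
import java.util.concurrent.atomic.AtomicLong;

public class SharedVsLocalStats {
    // One stats holder shared by every cache instance of this type
    // (illustrative stand-in for the stats object shared across searchers).
    static class Stats {
        final AtomicLong hits = new AtomicLong();
    }

    static class Cache {
        final Stats stats; // shared across instances
        long hits;         // local to this cache instance only
        Cache(Stats stats) { this.stats = stats; }
        void recordHit() {
            hits++;                       // per-searcher view
            stats.hits.incrementAndGet(); // cumulative view
        }
    }

    public static void main(String[] args) {
        Stats shared = new Stats();
        Cache first = new Cache(shared);
        first.recordHit();
        first.recordHit();
        // A new searcher gets a fresh cache: its local count restarts at 0,
        // but the shared stats keep accumulating.
        Cache second = new Cache(shared);
        second.recordHit();
        System.out.println(first.hits);        // 2
        System.out.println(second.hits);       // 1
        System.out.println(shared.hits.get()); // 3
    }
}
```

So dropping the local field and using only stats.hits would change what the per-searcher statistics report.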

  was (Author: [EMAIL PROTECTED]):
Your version changes what the method does in a couple of respects though.
bq. let us remove the field hits and use stats.hits wherever we need it
stats.hits is for all caches of this time (it's shared across searchers).  hits 
is local to this cache instance.
  



[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Fuad Efendi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617516#action_12617516
 ] 

funtick edited comment on SOLR-665 at 7/28/08 12:06 PM:


Mark, thanks for going deeper. Yes, _resize_ may swap the ( Entry[ ] ) _table_, a 
_key_ may disappear from its _bucket_, and _get()_ returns null:
{code}
/**
 * The table, resized as necessary. Length MUST Always be a power of two.
 */
transient Entry[] table;
...
public V get(Object key) {
    if (key == null)
        return getForNullKey();
    int hash = hash(key.hashCode());
    for (Entry e = table[indexFor(hash, table.length)];
         e != null;
         e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
            return e.value;
    }
    return null;
}
{code}

We might get _null_ instead of MyValue (during a _resize_ by another thread); not a 
big deal! It is still thread-safe: we will simply add a new entry (overwriting the 
existing one) in such extremely rare cases.

  was (Author: funtick):
Mark, thanks for going deeper. Yes, _resize_ may change 
({code}Entry[]{code}) _table_, _key_ will disappear from _bucket_ and _get() 
returns null:
{code}
/**
 * The table, resized as necessary. Length MUST Always be a power of two.
 */
transient Entry[] table;
...
public V get(Object key) {
if (key == null)
return getForNullKey();
int hash = hash(key.hashCode());
for (Entry e = table[indexFor(hash, table.length)];
 e != null;
 e = e.next) {
Object k;
if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
return e.value;
}
return null;
}
{code}

Might get  _null_ instead of MyValue (during _resize_ by another thread); not a 
big deal! It is still Thread-Safe. Will add new entry (overwrite existing) in 
such extremely rare cases.
  



[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Fuad Efendi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617516#action_12617516
 ] 

funtick edited comment on SOLR-665 at 7/28/08 12:12 PM:


Mark, thanks for going deeper. Yes, _resize_ may swap the ( Entry[ ] ) _table_, a 
_key_ may disappear from its _bucket_, and _get()_ returns null:
{code}
/**
 * The table, resized as necessary. Length MUST Always be a power of two.
 */
transient Entry[] table;
...
public V get(Object key) {
    if (key == null)
        return getForNullKey();
    int hash = hash(key.hashCode());
    for (Entry e = table[indexFor(hash, table.length)];
         e != null;
         e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
            return e.value;
    }
    return null;
}
{code}

We might get _null_ instead of MyValue (during a _resize_ by another thread); not a 
big deal! It is still thread-safe: we will simply add a new entry (overwriting the 
existing one) in such extremely rare cases.

- get() will _never_ return a wrong value (the Entry object is immutable):
{code}
if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
return e.value;
{code}



  was (Author: funtick):
Mark, thanks for going deeper. Yes, _resize_ may change ( Entry[ ] ) 
_table_, _key_ will disappear from _bucket_ and _get() returns null:
{code}
/**
 * The table, resized as necessary. Length MUST Always be a power of two.
 */
transient Entry[] table;
...
public V get(Object key) {
if (key == null)
return getForNullKey();
int hash = hash(key.hashCode());
for (Entry e = table[indexFor(hash, table.length)];
 e != null;
 e = e.next) {
Object k;
if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
return e.value;
}
return null;
}
{code}

Might get  _null_ instead of MyValue (during _resize_ by another thread); not a 
big deal! It is still Thread-Safe. Will add new entry (overwrite existing) in 
such extremely rare cases.
  



[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Fuad Efendi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617516#action_12617516
 ] 

funtick edited comment on SOLR-665 at 7/28/08 12:14 PM:


Mark, thanks for going deeper. Yes, _resize_ may swap the ( Entry[ ] ) _table_, a 
_key_ may disappear from its _bucket_, and _get()_ returns null:
{code}
/**
 * The table, resized as necessary. Length MUST Always be a power of two.
 */
transient Entry[] table;
...
public V get(Object key) {
    if (key == null)
        return getForNullKey();
    int hash = hash(key.hashCode());
    for (Entry e = table[indexFor(hash, table.length)];
         e != null;
         e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
            return e.value;
    }
    return null;
}
{code}

We might get _null_ instead of MyValue (during a _resize_ by another thread); not a 
big deal! It is still thread-safe: we will simply add a new entry (overwriting the 
existing one) in such extremely rare cases.

- get() will _never_ return a wrong value (the Entry object is immutable in our case):
{code}
if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
return e.value;
{code}



  was (Author: funtick):
Mark, thanks for going deeper. Yes, _resize_ may change ( Entry[ ] ) 
_table_, _key_ will disappear from _bucket_ and _get() returns null:
{code}
/**
 * The table, resized as necessary. Length MUST Always be a power of two.
 */
transient Entry[] table;
...
public V get(Object key) {
if (key == null)
return getForNullKey();
int hash = hash(key.hashCode());
for (Entry e = table[indexFor(hash, table.length)];
 e != null;
 e = e.next) {
Object k;
if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
return e.value;
}
return null;
}
{code}

Might get  _null_ instead of MyValue (during _resize_ by another thread); not a 
big deal! It is still Thread-Safe. Will add new entry (overwrite existing) in 
such extremely rare cases.

- get() will _never_ return wrong value (Entry object is immutable):
{code}
if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
return e.value;
{code}


  



[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Fuad Efendi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617529#action_12617529
 ] 

funtick edited comment on SOLR-665 at 7/28/08 12:51 PM:


The concerns are probably due to a misunderstanding of some _contract_... 

{code}
/**
 * The table, resized as necessary. Length MUST Always be a power of two.
 */
transient Entry[] table;

void resize(int newCapacity) {
    Entry[] oldTable = table;
    int oldCapacity = oldTable.length;
    if (oldCapacity == MAXIMUM_CAPACITY) {
        threshold = Integer.MAX_VALUE;
        return;
    }

    Entry[] newTable = new Entry[newCapacity];
    transfer(newTable);
    table = newTable;
    threshold = (int)(newCapacity * loadFactor);
}

public V get(Object key) {
    if (key == null)
        return getForNullKey();
    int hash = hash(key.hashCode());
    for (Entry e = table[indexFor(hash, table.length)];
         e != null;
         e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
            return e.value;
    }
    return null;
}

{code}


- in the worst case we will have a pointer to the _old_ table, and even with a _new_ 
one of smaller size we won't get _any_ ArrayIndexOutOfBoundsException.
- There is no _contract_ requiring synchronization on get() of HashMap or 
LinkedHashMap; it IS application specific.
- we will never get _wrong_ results, because Entry is immutable
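The out-of-bounds argument rests on the reader taking one snapshot of the array reference and computing the index from that same snapshot. A toy, single-threaded sketch of just that point (a plain int[] standing in for Entry[], with the "resize" simulated inline):

```java
public class SnapshotReadDemo {
    static int[] table = {10, 11, 12, 13, 14, 15, 16, 17}; // length 8

    public static void main(String[] args) {
        int[] t = table;    // reader snapshots the reference once
        table = new int[4]; // meanwhile, a "resize" installs a smaller array
        int hash = 7;
        // index computed from the SNAPSHOT's length, so it is always
        // in bounds for the snapshot, whatever `table` points to now
        int i = hash & (t.length - 1);
        System.out.println(t[i]); // prints 17
    }
}
```

Note this only illustrates the bounds argument; it says nothing about memory visibility or ordering between threads, which is the part the thread keeps debating.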

  was (Author: funtick):
concerns are probably because of misunderstanding of  some _contract_... 

{code}
/**
 * The table, resized as necessary. Length MUST Always be a power of two.
 */
transient Entry[] table;

void resize(int newCapacity) {
Entry[] oldTable = table;
int oldCapacity = oldTable.length;
if (oldCapacity == MAXIMUM_CAPACITY) {
threshold = Integer.MAX_VALUE;
return;
}

Entry[] newTable = new Entry[newCapacity];
transfer(newTable);
table = newTable;
threshold = (int)(newCapacity * loadFactor);
}


public V get(Object key) {
if (key == null)
return getForNullKey();
int hash = hash(key.hashCode());
for (Entry e = table[indexFor(hash, table.length)];
 e != null;
 e = e.next) {
Object k;
if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
return e.value;
}
return null;
}

{code}


- in worst case we will have pointer to _old_ table and even with _new_ one of 
smaller size we won't get _any_ ArrayIndexOutOfBounds.
There is no any _contract_ requiring synchronization on get() of HashMap or 
LinkedHashMap; it IS application specific.

  



[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Fuad Efendi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617529#action_12617529
 ] 

funtick edited comment on SOLR-665 at 7/28/08 12:54 PM:


The concerns are probably due to a misunderstanding of some _contract_... 

{code}
/**
 * The table, resized as necessary. Length MUST Always be a power of two.
 */
transient Entry[] table;

void resize(int newCapacity) {
    Entry[] oldTable = table;
    int oldCapacity = oldTable.length;
    if (oldCapacity == MAXIMUM_CAPACITY) {
        threshold = Integer.MAX_VALUE;
        return;
    }

    Entry[] newTable = new Entry[newCapacity];
    transfer(newTable);
    table = newTable;
    threshold = (int)(newCapacity * loadFactor);
}

public V get(Object key) {
    if (key == null)
        return getForNullKey();
    int hash = hash(key.hashCode());
    for (Entry e = table[indexFor(hash, table.length)];
         e != null;
         e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
            return e.value;
    }
    return null;
}

{code}


- in the worst case we will have a pointer to the _old_ table, and even with a _new_ 
one of smaller size we won't get _any_ ArrayIndexOutOfBoundsException.
- There is no _contract_ requiring synchronization on get() of HashMap or 
LinkedHashMap; it IS application specific.
- we will never get _wrong_ results, because Entry is immutable

  was (Author: funtick):
concerns are probably because of misunderstanding of  some _contract_... 

{code}
/**
 * The table, resized as necessary. Length MUST Always be a power of two.
 */
transient Entry[] table;

void resize(int newCapacity) {
Entry[] oldTable = table;
int oldCapacity = oldTable.length;
if (oldCapacity == MAXIMUM_CAPACITY) {
threshold = Integer.MAX_VALUE;
return;
}

Entry[] newTable = new Entry[newCapacity];
transfer(newTable);
table = newTable;
threshold = (int)(newCapacity * loadFactor);
}


public V get(Object key) {
if (key == null)
return getForNullKey();
int hash = hash(key.hashCode());
for (Entry e = table[indexFor(hash, table.length)];
 e != null;
 e = e.next) {
Object k;
if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
return e.value;
}
return null;
}

{code}


- in worst case we will have pointer to _old_ table and even with _new_ one of 
smaller size we won't get _any_ ArrayIndexOutOfBounds.
There is no any _contract_ requiring synchronization on get() of HashMap or 
LinkedHashMap; it IS application specific.
- we will never have _wrong_ results because Entry is immutable
  



[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Fuad Efendi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617529#action_12617529
 ] 

funtick edited comment on SOLR-665 at 7/28/08 1:04 PM:
---

The concerns are probably due to a misunderstanding of some _contract_... 

{code}
/**
 * The table, resized as necessary. Length MUST Always be a power of two.
 */
transient Entry[] table;

void resize(int newCapacity) {
    Entry[] oldTable = table;
    int oldCapacity = oldTable.length;
    if (oldCapacity == MAXIMUM_CAPACITY) {
        threshold = Integer.MAX_VALUE;
        return;
    }

    Entry[] newTable = new Entry[newCapacity];
    transfer(newTable);
    table = newTable;
    threshold = (int)(newCapacity * loadFactor);
}

public V get(Object key) {
    if (key == null)
        return getForNullKey();
    int hash = hash(key.hashCode());
    for (Entry e = table[indexFor(hash, table.length)];
         e != null;
         e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
            return e.value;
    }
    return null;
}

{code}


- in the worst case we will have a pointer to the _old_ table, and even with a _new_ 
one of smaller size we won't get _any_ ArrayIndexOutOfBoundsException.
- There is no _contract_ requiring synchronization on get() of HashMap or 
LinkedHashMap; it IS application specific.
- we will never get _wrong_ results, because Entry is immutable

{code}
/** 
 * Transfer all entries from current table to newTable.
 */
void transfer(Entry[] newTable) {
    Entry[] src = table;
    int newCapacity = newTable.length;
    for (int j = 0; j < src.length; j++) {
        Entry e = src[j];
        if (e != null) {
            src[j] = null;
            do {
                Entry next = e.next;
                int i = indexFor(e.hash, newCapacity);
                e.next = newTable[i];
                newTable[i] = e;
                e = next;
            } while (e != null);
        }
    }
}
{code}

- We won't even get a NullPointerException after src[j] = null.
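The NullPointerException claim can be shown with a toy linked-node sketch (the Node class is illustrative, not HashMap.Entry): a reader that already loaded the head from the slot keeps traversing via its own references even after the writer nulls the slot, because it holds the nodes directly, not the array cell.

```java
public class StaleSlotDemo {
    static class Node {
        final int value;
        Node next;
        Node(int value, Node next) { this.value = value; this.next = next; }
    }

    public static void main(String[] args) {
        Node chain = new Node(1, new Node(2, null));
        Node[] slots = { chain };
        Node e = slots[0]; // reader loads the head from the slot
        slots[0] = null;   // writer clears the slot, as transfer() does
        int sum = 0;
        while (e != null) { // traversal still works: we hold the nodes directly
            sum += e.value;
            e = e.next;
        }
        System.out.println(sum); // prints 3
    }
}
```

This covers only the nulled slot; transfer() also rewrites e.next while relinking, which is a separate question from the NullPointerException one.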

P.S.
Of course, I agree - these are Java internals, not the public Map interface 
_contract_ - but should we then avoid using the implementation, and base the 
decision on a specific implementation from Sun? I believe it is specified 
somewhere in a JSR too...
{code}
 * @author  Doug Lea
 * @author  Josh Bloch
 * @author  Arthur van Hoff
 * @author  Neal Gafter
 * @version 1.65, 03/03/05
{code}

> FIFO Cache (Unsynchronized): 9x times performance boost
> ---
>
> Key: SOLR-665
> URL: https://issues.apache.org/jira/browse/SOLR-665
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 1.3
> Environment: JRockit R27 (Java 6)
>Reporter: Fuad Efendi
> Attachments: FIFOCache.java
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Attached is modified version of LRUCache where 
> 1. map = new LinkedHashMap(initialSize, 0.75f, false) - so that 
> "reordering"/true (performance bottleneck of LRU) is replaced to 
> "insertion-order"/false (so that it became FIFO)
> 2. Almost all (absolutely unnecessary) synchronized statements commented out
> See discussion at 
> http://www.nabble.com/LRUCache---synchronized%21--td16439831.html
> 

[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Fuad Efendi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617529#action_12617529
 ] 

funtick edited comment on SOLR-665 at 7/28/08 3:46 PM:
---

The concerns probably stem from a misunderstanding of some _contract_...

{code}
/**
 * The table, resized as necessary. Length MUST Always be a power of two.
 */
transient Entry[] table;

void resize(int newCapacity) {
Entry[] oldTable = table;
int oldCapacity = oldTable.length;
if (oldCapacity == MAXIMUM_CAPACITY) {
threshold = Integer.MAX_VALUE;
return;
}

Entry[] newTable = new Entry[newCapacity];
transfer(newTable);
table = newTable;
threshold = (int)(newCapacity * loadFactor);
}


public V get(Object key) {
if (key == null)
return getForNullKey();
int hash = hash(key.hashCode());
for (Entry e = table[indexFor(hash, table.length)];
 e != null;
 e = e.next) {
Object k;
if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
return e.value;
}
return null;
}

{code}


- In the worst case we will hold a pointer to the _old_ table, and even with a _new_ table of smaller size we won't get _any_ ArrayIndexOutOfBoundsException.
- There is no _contract_ requiring synchronization on get() of HashMap or 
LinkedHashMap; it IS application specific.
- We will never get _wrong_ results because Entry is immutable.

{code}
/** 
 * Transfer all entries from current table to newTable.
 */
void transfer(Entry[] newTable) {
Entry[] src = table;
int newCapacity = newTable.length;
for (int j = 0; j < src.length; j++) {
Entry e = src[j];
if (e != null) {
src[j] = null;
do {
Entry next = e.next;
int i = indexFor(e.hash, newCapacity);  
e.next = newTable[i];
newTable[i] = e;
e = next;
} while (e != null);
}
}
}
{code}

- We won't even get a NullPointerException after src[j] = null.

P.S.
Of course, I agree - these are Java internals, not the public Map interface _contract_ - so should we avoid using the implementation then? I believe it is specified somewhere in a JSR too...
{code}
 * @author  Doug Lea
 * @author  Josh Bloch
 * @author  Arthur van Hoff
 * @author  Neal Gafter
 * @version 1.65, 03/03/05
{code}

P.P.S.
Do not forget to look at the top of this discussion:
{code}
description: xxx Cache(maxSize=1000, initialSize=1000) 
size : 2668705
cumulative_inserts : 4246246
{code}

- _cumulative_inserts_ is almost double _size_, which shows that double inserts are real.
- I checked catalina.out: no NullPointerException, ArrayIndexOutOfBoundsException, etc.
- I don't think we should worry _too much_ about a possible change of the Map implementation by Sun :P... in that case we should use neither java.lang.String nor java.util.Date (some classes are placed in the wrong packages).
- About thread safety: some participants wrongly assume that making all object access fully synchronized makes their code thread-safe... deadlock.
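The "double inserts" reading of the stats above can be made concrete. Even when every individual map operation is locked, the miss-check and the insert together are not atomic, so two threads can both miss and both insert for the same key. A deterministic sketch (all class and key names are illustrative; the latch forces the unlucky interleaving):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Demonstrates the check-then-act race behind "double inserts": both
// threads see a miss before either one inserts, so the value for the
// same key is computed and put twice.
public class DoubleInsertDemo {
    static final Map<String, Integer> cache = new HashMap<>();
    static final AtomicInteger computations = new AtomicInteger();
    static final CountDownLatch bothMissed = new CountDownLatch(2);

    static void lookupThenInsert(String key) throws InterruptedException {
        Integer v;
        synchronized (cache) {           // each step is locked...
            v = cache.get(key);
        }
        if (v == null) {
            bothMissed.countDown();
            bothMissed.await();          // ...but the check and the act
            int computed = computations.incrementAndGet();
            synchronized (cache) {       // are not one atomic unit
                cache.put(key, computed);
            }
        }
    }

    public static int run() {
        Runnable task = () -> {
            try {
                lookupThenInsert("q");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start(); t2.start();
        try {
            t1.join(); t2.join();
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        return computations.get();       // the same key was inserted twice
    }
}
```

This is harmless for an idempotent cache fill (the later put simply wins), which matches the observation that the only symptom is an inflated insert count, not corruption.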


[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Fuad Efendi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617639#action_12617639
 ] 

funtick edited comment on SOLR-665 at 7/28/08 6:28 PM:
---

This is an extremely simple concurrent LRU; I spent an hour creating it. It is 
based on ConcurrentHashMap. I don't use java.util.concurrent.locks, and I am 
trying to focus on the _requirements only_, avoiding implementing unnecessary 
methods of the Map interface (so I am not following the _contract_ ;) very sorry!)

{code}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ConcurrentLRU<K, V> {

    protected ConcurrentHashMap<K, ValueWrapper<V>> map;
    protected int maxEntries;

    public ConcurrentLRU(int maxEntries) {
        map = new ConcurrentHashMap<K, ValueWrapper<V>>();
        this.maxEntries = maxEntries;
    }

    public V put(K key, V value) {
        ValueWrapper<V> wrapper = map.put(key, new ValueWrapper<V>(value));
        checkRemove();
        return value;
    }

    void checkRemove() {
        if (map.size() <= maxEntries)
            return;
        Map.Entry<K, ValueWrapper<V>> eldestEntry = null;
        for (Map.Entry<K, ValueWrapper<V>> entry : map.entrySet()) {
            long popularity = entry.getValue().popularity;
            if (eldestEntry == null
                    || eldestEntry.getValue().popularity > popularity) {
                eldestEntry = entry;
            }
        }
        map.remove(eldestEntry.getKey(), eldestEntry.getValue());
    }

    public V get(Object key) {
        ValueWrapper<V> wrapper = map.get(key);
        return wrapper == null ? null : wrapper.getValue();
    }

    public static final class ValueWrapper<V> {
        static volatile long popularityCounter;
        volatile long popularity;
        V value;

        ValueWrapper(V value) {
            this.value = value;
            popularity = popularityCounter++;
        }

        public boolean equals(Object o) {
            if (!(o instanceof ValueWrapper)) {
                return false;
            }
            return (value == null ? ((ValueWrapper<?>) o).value == null
                    : value.equals(((ValueWrapper<?>) o).value));
        }

        public int hashCode() {
            return value.hashCode();
        }

        public V getValue() {
            popularity = popularityCounter++;
            return value;
        }
    }
}
{code}

P.S.
Do we need to synchronize put() or checkRemove()? The only hypothetical problem 
is an OutOfMemoryError. But this is just a first draft, very simplified... we do 
not need to sort an array.
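One detail worth flagging in the wrapper above: popularityCounter++ on a volatile long is a read-modify-write, not an atomic operation, so two concurrent stamps can collide or be lost. A minimal variant of the wrapper (illustrative only, not part of the attached code) using java.util.concurrent.atomic.AtomicLong avoids that:

```java
import java.util.concurrent.atomic.AtomicLong;

// Variant of ValueWrapper using AtomicLong: getAndIncrement() is an
// atomic read-modify-write, so concurrent stamps are never lost the
// way "volatile long" ++ increments can be.
public final class StampedValue<V> {
    private static final AtomicLong clock = new AtomicLong();
    private volatile long stamp;
    private final V value;

    public StampedValue(V value) {
        this.value = value;
        this.stamp = clock.getAndIncrement();
    }

    public V getValue() {
        stamp = clock.getAndIncrement();  // refresh recency on read
        return value;
    }

    public long getStamp() {
        return stamp;
    }
}
```

The shared static counter keeps the "one timer for all instances" idea from the original while making each stamp unique.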


[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Fuad Efendi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617657#action_12617657
 ] 

funtick edited comment on SOLR-665 at 7/28/08 7:36 PM:
---

Lars, I used FIFO because it makes an unsynchronized _get()_ extremely simple to obtain:
{code}
map = new LinkedHashMap(initialSize, 0.75f, true)  - LRU Cache
(and we need synchronized get())
map = new LinkedHashMap(initialSize, 0.75f, false) - FIFO
(and we do not need synchronized get()) 
{code}

Yonik, I'll try to improve ConcurrentLRU and share my findings... of course 
FIFO is not what we need.

bq. No it doesn't... think linked-list. It moves a single item, which is pretty 
fast.
Yes, which is why I wrote 'evenly distributed between several get() calls so we can't see it' - it keeps the list ordered, and we can't unsynchronize it, with all the consequences!!!
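The behavioral difference between the two constructors shows up directly in iteration order: get() reorders the internal linked list only when accessOrder=true. A small sketch (class and key names illustrative):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Shows why accessOrder=true forces a structural write on get(): the
// accessed entry moves to the tail of the internal linked list, while
// the insertion-order (FIFO) map leaves its list untouched.
public class OrderDemo {
    public static List<String> keysAfterGet(boolean accessOrder) {
        Map<String, Integer> m = new LinkedHashMap<>(16, 0.75f, accessOrder);
        m.put("a", 1);
        m.put("b", 2);
        m.put("c", 3);
        m.get("a");  // reorders the list only when accessOrder=true
        return new ArrayList<>(m.keySet());
    }
}
```

That list mutation on every get() is precisely why an access-order (LRU) LinkedHashMap cannot safely drop the lock on reads, while an insertion-order (FIFO) map at least leaves reads structurally side-effect-free.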

> FIFO Cache (Unsynchronized): 9x times performance boost
> ---
>
> Key: SOLR-665
> URL: https://issues.apache.org/jira/browse/SOLR-665
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 1.3
> Environment: JRockit R27 (Java 6)
>Reporter: Fuad Efendi
> Attachments: FIFOCache.java
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Attached is modified version of LRUCache where 
> 1. map = new LinkedHashMap(initialSize, 0.75f, false) - so that 
> "reordering"/true (performance bottleneck of LRU) is replaced to 
> "insertion-order"/false (so that it became FIFO)
> 2. Almost all (absolutely unnecessary) synchronized statements commented out
> See discussion at 
> http://www.nabble.com/LRUCache---synchronized%21--td16439831.html
> Performance metrics (taken from SOLR Admin):
> LRU
> Requests: 7638
> Average Time-Per-Request: 15300
> Average Request-per-Second: 0.06
> FIFO:
> Requests: 3355
> Average Time-Per-Request: 1610
> Average Request-per-Second: 0.11
> Performance increased 9 times which roughly corresponds to a number of CPU in 
> a system, http://www.tokenizer.org/ (Shopping Search Engine at Tokenizer.org)
> Current number of documents: 7494689
> name:  filterCache  
> class:org.apache.solr.search.LRUCache  
> version:  1.0  
> description:  LRU Cache(maxSize=1000, initialSize=1000)  
> stats:lookups : 15966954582
> hits : 16391851546
> hitratio : 0.102
> inserts : 4246120
> evictions : 0
> size : 2668705
> cumulative_lookups : 16415839763
> cumulative_hits : 16411608101
> cumulative_hitratio : 0.99
> cumulative_inserts : 4246246
> cumulative_evictions : 0 
> Thanks

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.




[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Fuad Efendi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617654#action_12617654
 ] 

funtick edited comment on SOLR-665 at 7/28/08 8:41 PM:
---

bq. The Solr admin pages will not give you exact measurements. 
Yes, and I do not need exact measurements! They give me averageTimePerRequest, 
which improved almost 10 times on a production server. Should I write JUnit tests 
and execute them in a single-threaded environment? Better would be to use The 
Grinder, but I don't have the time or spare CPUs.

bq. I've seen throughputs in excess of 400 searches per second. 
But 'searches per second' is not the same as 'average response time'!!!

bq. Are you using highlighting or anything else that might be CPU-intensive at 
all? 
Yes, I am using highlighting. You can see it at http://www.tokenizer.org


bq. I'm guessing that you're caching the results of all queries in memory such 
that no disk access is necessary.
{color:red} But this is another bug of SOLR!!! I am using extremely large 
caches but SOLR still *recalculates* facet intersections. {color}

bq. A FIFO cache might become a bottleneck itself - if the cache is very large 
and the most frequently accessed item is inserted just after the cache is 
created, all accesses will need to traverse all the other entries before 
getting that item.

- Sorry, I didn't understand... yes, if the cache contains 10 entries and the 
'most popular item' is removed... Why 'traverse all the other entries before 
getting that item'? Why would 9 items be less popular (cumulatively) than a 
single one (absolutely)?


bq. Consider the following case: thread A performs a synchronized put, thread B 
performs an unsynchronized get on the same key. B gets scheduled before A 
completes, the returned value will be undefined.
The returned value is well defined: it is either null or the correct value.

bq. That's exactly the case here - the update thread modifies the map 
structurally! 
Who cares? We are not iterating the map!

Anyway, I believe simplified ConcurrentLRU backed by ConcurrentHashMap is 
easier to understand and troubleshoot...

bq. I don't see the point of the static popularityCounter... that looks like a 
bug.
No, it is not a bug: it is virtually a "checkpoint", like a timer - one timer 
for all instances. We could use System.currentTimeMillis() instead, but a static 
volatile long is faster.

About the specific use case: yes... if someone gets 0.5-second response times for 
faceted queries, I am very happy... I had 15 seconds before going with FIFO.



[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Fuad Efendi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617654#action_12617654
 ] 

funtick edited comment on SOLR-665 at 7/28/08 8:46 PM:
---

bq. The Solr admin pages will not give you exact measurements. 
Yes, and I do not need exact measurements! It gives me averageTimePerRequest, 
which improved almost 10 times on a production server. Should I write JUnit 
tests and execute them in a single-threaded environment? It would be better to 
use The Grinder, but I don't have the time or spare CPUs.

bq. I've seen throughputs in excess of 400 searches per second. 
But 'searches per second' is not the same as 'average response time'!!!

bq. Are you using highlighting or anything else that might be CPU-intensive at 
all? 
Yes, I am using highlighting. You can see it at http://www.tokenizer.org


bq. I'm guessing that you're caching the results of all queries in memory such 
that no disk access is necessary.
{color:red} But this is another bug of SOLR!!! I am using extremely large 
caches but SOLR still *recalculates* facet intersections. {color}

bq. A FIFO cache might become a bottleneck itself - if the cache is very large 
and the most frequently accessed item is inserted just after the cache is 
created, all accesses will need to traverse all the other entries before 
getting that item.

- sorry, I didn't understand... yes, if cache contains 10 entries and 'most 
popular item' is removed... Why 'traverse all the other entries before getting 
that item'? why 9 items are less popular (cumulative) than single one 
(absolute)?

You probably mean 'LinkedList traversal', but this is not the case. This is why 
we need to browse the Java source... LinkedHashMap extends HashMap and there is 
no 'traversal':
{code}
public V get(Object key) {
    Entry e = (Entry)getEntry(key);
    if (e == null)
        return null;
    e.recordAccess(this);
    return e.value;
}
{code}
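The FIFO variant described in this issue relies on the same class: passing accessOrder=false to the LinkedHashMap constructor keeps insertion order, so overriding removeEldestEntry evicts the oldest insert. A minimal sketch under those assumptions (hypothetical class name, not the attached FIFOCache.java):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch: an insertion-order LinkedHashMap (accessOrder = false) gives FIFO
// eviction when removeEldestEntry is overridden. Capacity is illustrative.
class FifoMap<K, V> extends LinkedHashMap<K, V> {
    private final int maxSize;

    FifoMap(int maxSize) {
        super(16, 0.75f, false);   // false = insertion order, i.e. FIFO
        this.maxSize = maxSize;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxSize;   // evict the oldest insert
    }
}
```

With accessOrder=false, a get() does not reorder entries, which is exactly why the LRU "recordAccess" bottleneck disappears.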


bq. Consider the following case: thread A performs a synchronized put, thread B 
performs an unsynchronized get on the same key. B gets scheduled before A 
completes, the returned value will be undefined.
the returned value is well defined: it is either null or the correct value.

bq. That's exactly the case here - the update thread modifies the map 
structurally! 
Who cares? We are not iterating the map!

Anyway, I believe a simplified ConcurrentLRU backed by ConcurrentHashMap is 
easier to understand and troubleshoot...

bq. I don't see the point of the static popularityCounter... that looks like a 
bug.
No, it is not a bug. It is virtually a "checkpoint", like a timer, one timer 
for all instances. We could use System.currentTimeMillis() instead, but a 
static volatile long is faster.

About the specific use case: yes... if someone gets 0.5-second response times 
for faceted queries, I am very happy... I had 15 seconds before going with FIFO. 



[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Fuad Efendi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617654#action_12617654
 ] 

funtick edited comment on SOLR-665 at 7/28/08 8:48 PM:
---

bq. The Solr admin pages will not give you exact measurements. 
Yes, and I do not need exact measurements! It gives me averageTimePerRequest, 
which improved almost 10 times on a production server. Should I write JUnit 
tests and execute them in a single-threaded environment? It would be better to 
use The Grinder, but I don't have the time or spare CPUs.

bq. I've seen throughputs in excess of 400 searches per second. 
But 'searches per second' is not the same as 'average response time'!!!

bq. Are you using highlighting or anything else that might be CPU-intensive at 
all? 
Yes, I am using highlighting. You can see it at http://www.tokenizer.org


bq. I'm guessing that you're caching the results of all queries in memory such 
that no disk access is necessary.
{color:red} But this is another bug of SOLR!!! I am using extremely large 
caches but SOLR still *recalculates* facet intersections. {color}

bq. A FIFO cache might become a bottleneck itself - if the cache is very large 
and the most frequently accessed item is inserted just after the cache is 
created, all accesses will need to traverse all the other entries before 
getting that item.

- sorry, I didn't understand... yes, if cache contains 10 entries and 'most 
popular item' is removed... Why 'traverse all the other entries before getting 
that item'? why 9 items are less popular (cumulative) than single one 
(absolute)?

You probably mean 'LinkedList traversal', but this is not the case. This is why 
we need to browse the Java source... LinkedHashMap extends HashMap and there is 
no 'traversal':
{code}
public V get(Object key) {
    Entry e = (Entry)getEntry(key);
    if (e == null)
        return null;
    e.recordAccess(this);
    return e.value;
}
{code}


bq. Consider the following case: thread A performs a synchronized put, thread B 
performs an unsynchronized get on the same key. B gets scheduled before A 
completes, the returned value will be undefined.
the returned value is well defined: it is either null or the correct value.

bq. That's exactly the case here - the update thread modifies the map 
structurally! 
Who cares? We are not iterating the map!

bq. That said, if you can show conclusively (e.g. with a profiler) that the 
synchronized access is indeed the bottleneck and incurs a heavy penalty on 
performance, then I'm all for investigating this further.

*What?!!*


Anyway, I believe a simplified ConcurrentLRU backed by ConcurrentHashMap is 
easier to understand and troubleshoot...

bq. I don't see the point of the static popularityCounter... that looks like a 
bug.
No, it is not a bug. It is virtually a "checkpoint", like a timer, one timer 
for all instances. We could use System.currentTimeMillis() instead, but a 
static volatile long is faster.

About the specific use case: yes... if someone gets 0.5-second response times 
for faceted queries, I am very happy... I had 15 seconds before going with FIFO. 



[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Fuad Efendi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617684#action_12617684
 ] 

funtick edited comment on SOLR-665 at 7/28/08 9:20 PM:
---

bq. Fuad is after fastest-possible reads, everybody is after reasonable 
behavior in the face of concurrent writes

Thanks and sorry for runtime errors;

FIFO looks strange at first, but... for a large cache (10 items), the most 
popular item can be _mistakenly_ removed... but I don't think there are any 
'most popular facets' etc.; access is evenly distributed in most cases.

Another issue: SOLR always tries to _recalculate facets_ even with an extremely 
large filterCache & queryResultCache; even the same faceted query always shows 
the same long response times.

bq. It is if nothing is modifying the map during the get. If something is 
modifying the map you don't know how the implementation handles the insert of a 
new value. It might copy the object, and you'd end up with half an object or 
even an invalid memory location. That's why the javadoc says that you must 
synchronize accesses if anything modifies the map - this is not limited to 
iterators.

I agree, of course... However, we are not dealing with an unknown implementation 
of java.util.Map that somehow clones (java.lang.Cloneable) objects or uses some 
weird object introspection, etc.

  
> FIFO Cache (Unsynchronized): 9x times performance boost
> ---
>
> Key: SOLR-665
> URL: https://issues.apache.org/jira/browse/SOLR-665
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 1.3
> Environment: JRockit R27 (Java 6)
>Reporter: Fuad Efendi
> Attachments: FIFOCache.java
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Attached is modified version of LRUCache where 
> 1. map = new LinkedHashMap(initialSize, 0.75f, false) - so that 
> "reordering"/true (performance bottleneck of LRU) is replaced to 
> "insertion-order"/false (so that it became FIFO)
> 2. Almost all (absolutely unneccessary) synchronized statements commented out
> See discussion at 
> http://www.nabble.com/LRUCache---synchronized%21--td16439831.html
> Performance metrics (taken from SOLR Admin):
> LRU
> Requests: 7638
> Average Time-Per-Request: 15300
> Average Request-per-Second: 0.06
> FIFO:
> Requests: 3355
> Average Time-Per-Request: 1610
> Average Request-per-Second: 0.11
> Performance increased 9 times which roughly corresponds to a number of CPU in 
> a system, http://www.tokenizer.org/ (Shopping Search Engine at Tokenizer.org)
> Current number of documents: 7494689
> name:  filterCache  
> class:org.apache.solr.search.LRUCache  
> version:  1.0  
> description:  LRU Cache(maxSize=1000, initialSize=1000)  
> stats:lookups : 15966954582
> hits : 16391851546
> hitratio : 0.102
> inserts : 4246120
> evictions : 0
> size : 2668705
> cumulative_lookups : 16415839763
> cumulative_hits : 16411608101
> cumulative_hitratio : 0.99
> cumulative_inserts : 4246246
> cumulative_evictions : 0 
> Thanks

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-29 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12618064#action_12618064
 ] 

noble.paul edited comment on SOLR-665 at 7/29/08 10:08 PM:
---

There are a few obvious issues with your patch.
Effendi, your implementation is wrong:

{code:java}
if (wrapper.lastAccessed < pointer.lastAccessed) {
    ref.set(new Pointer(key, wrapper.lastAccessed));
}
{code}
will always evaluate to false, and the reference will always have one value. 
We must remove the entry which was accessed first (not last). To get that you 
must maintain a linked list the way LinkedHashMap does. There is no other 
shortcut. Keeping one reference *will not* work. 
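For illustration only (hypothetical names, and not the approach Noble took): insertion order can also be recorded in a separate ConcurrentLinkedQueue, so that gets never touch the eviction structure at all.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

// Sketch: FIFO eviction order tracked in a ConcurrentLinkedQueue,
// so there is no single "oldest" reference contended on every access.
class QueueFifoCache<K, V> {
    private final ConcurrentHashMap<K, V> map = new ConcurrentHashMap<>();
    private final Queue<K> order = new ConcurrentLinkedQueue<>();
    private final int maxSize;

    QueueFifoCache(int maxSize) { this.maxSize = maxSize; }

    public V get(K key) {
        return map.get(key);         // reads never touch the queue
    }

    public void put(K key, V val) {
        if (map.put(key, val) == null) {
            order.add(key);          // record insertion order
        }
        while (map.size() > maxSize) {
            K oldest = order.poll(); // evict the first-inserted key
            if (oldest == null) break;
            map.remove(oldest);
        }
    }
}
```

This only keeps FIFO (not LRU) order, which matches the idea discussed in this issue.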

In that implementation there is always contention on the head, so gets are 
bound to be slow. That is why I did not go that route.
And the static volatile counter is not thread-safe; static in general is not a 
good idea.


There is no need to use a WeakReference anywhere. This is a cache, so it *must* 
maintain strong references.
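Noble's point about the counter can be made concrete: counter++ on a volatile long is a non-atomic read-modify-write and can lose updates under contention, while AtomicLong.incrementAndGet() cannot. A generic illustration, not code from any attachment:

```java
import java.util.concurrent.atomic.AtomicLong;

// Illustration: AtomicLong.incrementAndGet() is an atomic read-modify-write,
// so concurrent increments are never lost -- unlike "volatileLong++".
class SafeCounterDemo {
    static long count(int threads, int perThread) {
        AtomicLong counter = new AtomicLong();
        Thread[] ts = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            ts[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.incrementAndGet();
                }
            });
            ts[i].start();
        }
        for (Thread t : ts) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return counter.get();       // always threads * perThread
    }
}
```

volatile alone only guarantees visibility of a single read or write, not atomicity of the increment.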




[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-30 Thread Fuad Efendi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12618339#action_12618339
 ] 

funtick edited comment on SOLR-665 at 7/30/08 7:06 AM:
---

Noble, thanks for feedback!

Of course my code is buggy, but I only wanted _to illustrate_ the simplest 
idea; I am extremely busy with other stuff (Liferay) and can't focus on SOLR 
improvements... maybe during the weekend.

bq. ...will always evaluate to false. And the reference will always have one 
value
- yes, this is a bug. There are other bugs too...

bq. We must be removing the entry which was accessed first (not last).. 
I mean (and the code means) the same thing; probably just wrong wording.

bq. And the static volatile counter is not threadsafe. 
Do we _really-really_ need thread safety here? By using 'volatile' I only 
prevent _some_ JVMs from trying to optimize some code (and causing problems 
with per-instance variables which never change).

bq. There is no need to use a WeakReference anywhere
Agree... 

bq. To get that you must maintian a linkedlist the way linkedhashmap maintains. 
No other shortcut. 
Maybe... but it looks similar to Arrays.sort(), TreeSet, etc., and I am 
trying to avoid this. 'No other shortcut' - maybe, but I am unsure.

Thanks!






[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-30 Thread Fuad Efendi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12618339#action_12618339
 ] 

funtick edited comment on SOLR-665 at 7/30/08 7:08 AM:
---

Noble, thanks for feedback!

Of course my code is buggy, but I only wanted _to illustrate_ the simplest 
idea; I am extremely busy with other stuff (Liferay) and can't focus on SOLR 
improvements... maybe during the weekend.

bq. ...will always evaluate to false. And the reference will always have one 
value
- yes, this is a bug. There are other bugs too...

bq. We must be removing the entry which was accessed first (not last).. 
I mean (and the code means) the same thing; probably just wrong wording.

bq. And the static volatile counter is not threadsafe. 
Do we _really-really_ need thread safety here? By using 'volatile' I only 
prevent _some_ JVMs from trying to optimize some code (and causing problems).

bq. There is no need to use a WeakReference anywhere
Agree... 

bq. To get that you must maintian a linkedlist the way linkedhashmap maintains. 
No other shortcut. 
Maybe... but it looks similar to Arrays.sort(), TreeSet, etc., and I am 
trying to avoid this. 'No other shortcut' - maybe, but I am unsure.

Thanks!






[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-30 Thread Fuad Efendi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617654#action_12617654
 ] 

funtick edited comment on SOLR-665 at 7/30/08 10:07 AM:


bq. The Solr admin pages will not give you exact measurements. 
Yes, and I do not need exact measurements! It gives me averageTimePerRequest, 
which improved almost 10 times on a production server. Should I write JUnit 
tests and execute them in a single-threaded environment? It would be better to 
use The Grinder, but I don't have the time or spare CPUs.

bq. I've seen throughputs in excess of 400 searches per second. 
But 'searches per second' is not the same as 'average response time'!!!

bq. Are you using highlighting or anything else that might be CPU-intensive at 
all? 
Yes, I am using highlighting. You can see it at http://www.tokenizer.org


bq. I'm guessing that you're caching the results of all queries in memory such 
that no disk access is necessary.
{color:red} But this is another bug of SOLR!!! I am using extremely large 
caches but SOLR still *recalculates* facet intersections. {color}
https://issues.apache.org/jira/browse/SOLR-669

bq. A FIFO cache might become a bottleneck itself - if the cache is very large 
and the most frequently accessed item is inserted just after the cache is 
created, all accesses will need to traverse all the other entries before 
getting that item.

- sorry, I didn't understand... yes, if cache contains 10 entries and 'most 
popular item' is removed... Why 'traverse all the other entries before getting 
that item'? why 9 items are less popular (cumulative) than single one 
(absolute)?

You probably mean 'LinkedList traversal', but this is not the case. This is why 
we need to browse the Java source... LinkedHashMap extends HashMap and there is 
no 'traversal':
{code}
public V get(Object key) {
    Entry e = (Entry)getEntry(key);
    if (e == null)
        return null;
    e.recordAccess(this);
    return e.value;
}
{code}


bq. Consider the following case: thread A performs a synchronized put, thread B 
performs an unsynchronized get on the same key. B gets scheduled before A 
completes, the returned value will be undefined.
the returned value is well defined: it is either null or the correct value.

bq. That's exactly the case here - the update thread modifies the map 
structurally! 
Who cares? We are not iterating the map!

bq. That said, if you can show conclusively (e.g. with a profiler) that the 
synchronized access is indeed the bottleneck and incurs a heavy penalty on 
performance, then I'm all for investigating this further.

*What?!!*


Anyway, I believe a simplified ConcurrentLRU backed by ConcurrentHashMap is 
easier to understand and troubleshoot...

bq. I don't see the point of the static popularityCounter... that looks like a 
bug.
No, it is not a bug. It is virtually a "checkpoint", like a timer, one timer 
for all instances. We could use System.currentTimeMillis() instead, but a 
static volatile long is faster.

About the specific use case: yes... if someone gets 0.5-second response times 
for faceted queries, I am very happy... I had 15 seconds before going with FIFO. 


  was (Author: funtick):
bq. The Solr admin pages will not give you exact measurements. 
Yes, and I do not need exact measurements! It gives me averageTimePerRequest 
which improved almost 10 times on production server. Should I right JUnit tests 
and execute it in a single-threaded environment? Better is to use The Grinder, 
but I don't have time and spare CPUs.

bq. I've seen throughputs in excess of 400 searches per second. 
But 'searches per second' is not the same as 'average response time'!!!

bq. Are you using highlighting or anything else that might be CPU-intensive at 
all? 
Yes, I am using highlighting. You can see it at http://www.tokenizer.org


bq. I'm guessing that you're caching the results of all queries in memory such 
that no disk access is necessary.
{color:red} But this is another bug of SOLR!!! I am using extremely large 
caches but SOLR still *recalculates* facet intersections. {color}

bq. A FIFO cache might become a bottleneck itself - if the cache is very large 
and the most frequently accessed item is inserted just after the cache is 
created, all accesses will need to traverse all the other entries before 
getting that item.

- Sorry, I didn't understand... yes, if the cache contains 10 entries and the 
'most popular item' is removed... Why 'traverse all the other entries before 
getting that item'? Why are 9 items less popular (cumulatively) than a single 
one (absolutely)?

You probably mean 'LinkedList traversal', but that is not the case. This is why 
we need to browse the Java source... LinkedHashMap extends HashMap, and there is 
no 'traversal' in get():
{code}
public V get(Object key) {
    Entry e = (Entry) getEntry(key);
    if (e == null)
        return null;
    e.recordAccess(this);
    return e.value;
}
{code}

[jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-30 Thread Fuad Efendi (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617654#action_12617654
 ] 

funtick edited comment on SOLR-665 at 7/30/08 10:08 AM:


bq. The Solr admin pages will not give you exact measurements. 
Yes, and I do not need exact measurements! It gives me averageTimePerRequest, 
which improved almost 10 times on a production server. Should I write JUnit 
tests and execute them in a single-threaded environment? It would be better to 
use The Grinder, but I don't have time and spare CPUs.

bq. I've seen throughputs in excess of 400 searches per second. 
But 'searches per second' is not the same as 'average response time'!!!

bq. Are you using highlighting or anything else that might be CPU-intensive at 
all? 
Yes, I am using highlighting. You can see it at http://www.tokenizer.org


bq. I'm guessing that you're caching the results of all queries in memory such 
that no disk access is necessary.
{color:red} But this is another bug of SOLR!!! I am using extremely large 
caches but SOLR still *recalculates* facet intersections. {color} - SOLR-669
http://issues.apache.org/jira/browse/SOLR-669

bq. A FIFO cache might become a bottleneck itself - if the cache is very large 
and the most frequently accessed item is inserted just after the cache is 
created, all accesses will need to traverse all the other entries before 
getting that item.

- Sorry, I didn't understand... yes, if the cache contains 10 entries and the 
'most popular item' is removed... Why 'traverse all the other entries before 
getting that item'? Why are 9 items less popular (cumulatively) than a single 
one (absolutely)?

You probably mean 'LinkedList traversal', but that is not the case. This is why 
we need to browse the Java source... LinkedHashMap extends HashMap, and there is 
no 'traversal' in get():
{code}
public V get(Object key) {
    Entry e = (Entry) getEntry(key);
    if (e == null)
        return null;
    e.recordAccess(this);
    return e.value;
}
{code}


bq. Consider the following case: thread A performs a synchronized put, thread B 
performs an unsynchronized get on the same key. B gets scheduled before A 
completes, the returned value will be undefined.
the returned value is well defined: it is either null or the correct value.

bq. That's exactly the case here - the update thread modifies the map 
structurally! 
Who cares? We are not iterating the map!

bq. That said, if you can show conclusively (e.g. with a profiler) that the 
synchronized access is indeed the bottleneck and incurs a heavy penalty on 
performance, then I'm all for investigating this further.

*What?!!*


Anyway, I believe a simplified ConcurrentLRU backed by ConcurrentHashMap is 
easier to understand and troubleshoot...

bq. I don't see the point of the static popularityCounter... that looks like a 
bug.
No, it is not a bug. It is virtually a "checkpoint", like a timer, one timer 
for all instances. We could use System.currentTimeMillis() instead, but a static 
volatile long is faster.

About the specific use case: yes... if someone gets 0.5-second response times 
for faceted queries, I am very happy... I had 15 seconds before going with FIFO. 


  was (Author: funtick):
bq. The Solr admin pages will not give you exact measurements. 
Yes, and I do not need exact measurements! It gives me averageTimePerRequest, 
which improved almost 10 times on a production server. Should I write JUnit 
tests and execute them in a single-threaded environment? It would be better to 
use The Grinder, but I don't have time and spare CPUs.

bq. I've seen throughputs in excess of 400 searches per second. 
But 'searches per second' is not the same as 'average response time'!!!

bq. Are you using highlighting or anything else that might be CPU-intensive at 
all? 
Yes, I am using highlighting. You can see it at http://www.tokenizer.org


bq. I'm guessing that you're caching the results of all queries in memory such 
that no disk access is necessary.
{color:red} But this is another bug of SOLR!!! I am using extremely large 
caches but SOLR still *recalculates* facet intersections. {color}
https://issues.apache.org/jira/browse/SOLR-669

bq. A FIFO cache might become a bottleneck itself - if the cache is very large 
and the most frequently accessed item is inserted just after the cache is 
created, all accesses will need to traverse all the other entries before 
getting that item.

- Sorry, I didn't understand... yes, if the cache contains 10 entries and the 
'most popular item' is removed... Why 'traverse all the other entries before 
getting that item'? Why are 9 items less popular (cumulatively) than a single 
one (absolutely)?

You probably mean 'LinkedList traversal', but that is not the case. This is why 
we need to browse the Java source... LinkedHashMap extends HashMap, and there is 
no 'traversal' in get():
{code}
public V get(Object key) {
    Entry e = (Entry) getEntry(key);
    if (e == null)
        return null;
    e.recordAccess(this);
    return e.value;
}
{code}


Re: [jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-28 Thread Yonik Seeley
Fuad, please stop editing entries that already have responses...
it makes it very difficult to follow things.

-Yonik


Re: [jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-29 Thread Andrew Savory
Hi,

2008/7/29 Yonik Seeley <[EMAIL PROTECTED]>:
> Fuad, please stop editing entries that already have responses...
> it makes it very difficult to follow things.

Actually I'd argue that all such technical discussion would be better
done on the mailing list rather than through JIRA. Mail clients are
designed for threaded discussions far better than JIRA's web GUI. And
JIRA's posting back to the list with bq. makes most responses
impossible to follow. Excessive use of JIRA feels like a community
antipattern to me.

Just my $0.02


Andrew.
--
[EMAIL PROTECTED] / [EMAIL PROTECTED]
http://www.andrewsavory.com/


Re: [jira] Issue Comment Edited: (SOLR-665) FIFO Cache (Unsynchronized): 9x times performance boost

2008-07-29 Thread Mike Klaas


On 29-Jul-08, at 3:20 AM, Andrew Savory wrote:


Actually I'd argue that all such technical discussion would be better
done on the mailing list rather than through JIRA. Mail clients are
designed for threaded discussions far better than JIRA's web GUI. And
JIRA's posting back to the list with bq. makes most responses
impossible to follow. Excessive use of JIRA feels like a community
antipattern to me.


+1

-Mike